To Paul Halmos


In Memoriam 


Preface 


It seems to have been decided that undergraduate mathematics today rests 
on two foundations: calculus and linear algebra. These may not be the 
best foundations for, say, number theory or combinatorics, but they serve 
quite well for undergraduate analysis and several varieties of undergradu- 
ate algebra and geometry. The really perfect sequel to calculus and linear 
algebra, however, would be a blend of the two—a subject in which calcu- 
lus throws light on linear algebra and vice versa. Look no further! This 
perfect blend of calculus and linear algebra is Lie theory (named to honor 
the Norwegian mathematician Sophus Lie, pronounced “Lee”). So why
is Lie theory not a standard undergraduate topic? 


The problem is that, until recently, Lie theory was a subject for mature 
mathematicians or else a tool for chemists and physicists. There was no 
Lie theory for novice mathematicians. Only in the last few years have there 
been serious attempts to write Lie theory books for undergraduates. These 
books broke through to the undergraduate level by making some sensible 
compromises with generality; they stick to matrix groups and mainly to the 
classical ones, such as rotation groups of n-dimensional space. 


In this book I stick to similar subject matter. The classical groups 
are introduced via a study of rotations in two, three, and four dimensions, 
which is also an appropriate place to bring in complex numbers and quater- 
nions. From there it is only a short step to studying rotations in real, 
complex, and quaternion spaces of any dimension. In so doing, one has 
introduced the classical simple Lie groups, in their most geometric form, 
using only basic linear algebra. Then calculus intervenes to find the tan- 
gent spaces of the classical groups—their Lie algebras—and to move back 
and forth between the group and its algebra via the log and exponential 
functions. Again, the basics suffice: single-variable differentiation and the 
Taylor series for $e^x$ and $\log(1+x)$.


Where my book diverges from the others is at the next level, the miraculous
level where one discovers that the (curved) structure of a Lie group is
almost completely captured by the structure of its (flat) Lie algebra. At this 
level, the other books retain many traces of the sophisticated approach to 
Lie theory. For example, they rely on deep ideas from outside Lie theory, 
such as the inverse function theorem, existence theorems for ODEs, and 
representation theory. Even inside Lie theory, they depend on the Killing 
form and the whole root system machine to prove simplicity of the classical 
Lie algebras, and they use everything under the sun to prove the
Campbell–Baker–Hausdorff theorem that lifts structure from the Lie algebra to the Lie
group. But actually, proving simplicity of the classical Lie algebras can be
done by basic matrix arithmetic, and there is an amazing elementary proof
of Campbell–Baker–Hausdorff due to Eichler [1968].

The existence of these little-known elementary proofs convinced me 
that a naive approach to Lie theory is possible and desirable. The aim of 
this book is to carry it out—developing the central concepts and results of 
Lie theory by the simplest possible methods, mainly from single-variable 
calculus and linear algebra. Familiarity with elementary group theory is 
also desirable, but I provide a crash course on the basics of group theory in 
Sections 2.1 and 2.2. 

The naive approach to Lie theory is due to von Neumann [1929], and it 
is now possible to streamline it by using standard results of undergraduate 
mathematics, particularly the results of linear algebra. Of course, there is a 
downside to naiveté. It is probably not powerful enough to prove some of 
the results for which Lie theory is famous, such as the classification of the 
simple Lie algebras and the discovery of the five exceptional algebras.¹ To
compensate for this lack of technical power, the end-of-chapter discussions 
introduce important results beyond those proved in the book, as part of an 
informal sketch of Lie theory and its history. It is also true that the naive 
methods do not afford the same insights as more sophisticated methods. 
But they offer another insight that is often undervalued—some important 
theorems are not as difficult as they look! I think that all mathematics 
students appreciate this kind of insight. 

In any case, my approach is not entirely naive. A certain amount of
topology is essential, even in basic Lie theory, and in Chapter 8 I take
the opportunity to develop all the appropriate concepts from scratch. This
includes everything from open and closed sets to simple connectedness, so
the book contains in effect a minicourse on topology, with the rich class
of multidimensional examples that Lie theory provides. Readers already
familiar with topology can probably skip this chapter, or simply skim it to
see how Lie theory influences the subject. (Also, if time does not permit
covering the whole book, then the end of Chapter 7 is a good place to stop.)

¹ I say so from painful experience, having entered Lie theory with the aim of
understanding the exceptional groups. My opinion now is that the Lie theory that
precedes the classification is a book in itself.

I am indebted to Wendy Baratta, Simon Goberstein, Brian Hall, Ro- 
han Hewson, Chris Hough, Nathan Jolly, David Kramer, Jonathan Lough, 
Michael Sun, Marc Ryser, Abe Shenitzer, Paul Stanford, Fan Wu and the 
anonymous referees for many corrections and comments. As usual, my 
wife, Elaine, served as first proofreader; my son Robert also served as the 
model for Figure 8.7. Thanks go to Monash University for the opportunity 
to teach courses from which this book has grown, and to the University of 
San Francisco for support while writing it. 

Finally, a word about my title. Readers of a certain age will remember 
the book Naive Set Theory by Paul Halmos—a lean and lively volume 
covering the parts of set theory that all mathematicians ought to know. 
Paul Halmos (1916-2006) was my mentor in mathematical writing, and I 
dedicate this book to his memory. While not attempting to emulate his style 
(which is inimitable), I hope that Naive Lie Theory can serve as a similar 
introduction to Lie groups and Lie algebras. Lie theory today has become 
the subject that all mathematicians ought to know something about, so I 
believe the time has come for a naive, but mathematical, approach. 


John Stillwell 
University of San Francisco, December 2007 
Monash University, February 2008


Geometry of complex
numbers and quaternions 


PREVIEW 


When the plane is viewed as the plane $\mathbb{C}$ of complex numbers, rotation
about $O$ through angle $\theta$ is the same as multiplication by the number

$$e^{i\theta} = \cos\theta + i\sin\theta.$$

The set of all such numbers is the unit circle or 1-dimensional sphere

$$\mathbb{S}^1 = \{z : |z| = 1\}.$$

Thus $\mathbb{S}^1$ is not only a geometric object, but also an algebraic structure;
in this case a group, under the operation of complex number multiplication.
Moreover, the multiplication operation $e^{i\theta_1} \cdot e^{i\theta_2} = e^{i(\theta_1+\theta_2)}$ and the inverse
operation $(e^{i\theta})^{-1} = e^{i(-\theta)}$ depend smoothly on the parameter $\theta$. This
makes $\mathbb{S}^1$ an example of what we call a Lie group.

However, in some respects $\mathbb{S}^1$ is too special to be a good illustration of
Lie theory. The group $\mathbb{S}^1$ is 1-dimensional and commutative, because
multiplication of complex numbers is commutative. This property of complex
numbers makes the Lie theory of $\mathbb{S}^1$ trivial in many ways.

To obtain a more interesting Lie group, we define the four-dimensional
algebra of quaternions and the three-dimensional sphere $\mathbb{S}^3$ of unit
quaternions. Under quaternion multiplication, $\mathbb{S}^3$ is a noncommutative Lie group
known as SU(2), closely related to the group of space rotations.


J. Stillwell, Naive Lie Theory, DOI: 10.1007/978-0-387-78214-0_1


1.1 Rotations of the plane


A rotation of the plane $\mathbb{R}^2$ about the origin $O$ through angle $\theta$ is a linear
transformation $R_\theta$ that sends the basis vectors $(1,0)$ and $(0,1)$ to
$(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$, respectively (Figure 1.1).

[Figure 1.1: Rotation of the plane through angle $\theta$, showing the image points $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$.]

It follows by linearity that $R_\theta$ sends the general vector

$$(x,y) = x(1,0) + y(0,1) \quad\text{to}\quad (x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta),$$

and that $R_\theta$ is represented by the matrix

$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

We also call this matrix $R_\theta$. Then applying the rotation to $(x,y)$ is the same
as multiplying the column vector $\begin{pmatrix} x \\ y \end{pmatrix}$ on the left by matrix $R_\theta$, because

$$R_\theta \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}.$$

Since we apply matrices from the left, applying $R_\varphi$ then $R_\theta$ is the same
as applying the product matrix $R_\theta R_\varphi$. (Admittedly, this matrix happens
to equal $R_\varphi R_\theta$ because both equal $R_{\theta+\varphi}$. But when we come to space
rotations the order of the matrices will be important.)

Thus we can represent the geometric operation of combining successive
rotations by the algebraic operation of multiplying matrices. The main
aim of this book is to generalize this idea, that is, to study groups of linear
transformations by representing them as matrix groups. For the moment
one can view a matrix group as a set of matrices that includes, along with
any two members $A$ and $B$, the matrices $AB$, $A^{-1}$, and $B^{-1}$. Later (in Section 7.2)
we impose an extra condition that ensures “smoothness” of matrix
groups, but the precise meaning of smoothness need not be considered yet. 
For those who cannot wait to see a definition, we give one in the subsection 
below—but be warned that its meaning will not become completely clear 
until Chapters 7 and 8. 
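
As a minimal numerical sketch of the idea that combining rotations corresponds to multiplying matrices, the following Python fragment builds the rotation matrix for a given angle and checks that the product of the matrices for two angles agrees, up to rounding, with the matrix for the sum of the angles, in either order. The helper names `rotation`, `matmul`, and `close` are illustrative choices, not notation from the text.

```python
import math

def rotation(theta):
    """2x2 rotation matrix for angle theta, stored as a tuple of rows."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices, allowing for rounding error."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

theta, phi = 0.7, 1.9
# Both orders of multiplication should give the rotation through theta + phi.
print(close(matmul(rotation(theta), rotation(phi)), rotation(theta + phi)))
print(close(matmul(rotation(phi), rotation(theta)), rotation(theta + phi)))
```

The check uses a tolerance because the entries are floating-point approximations of sines and cosines.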

The matrices $R_\theta$, for all angles $\theta$, form a group called the special
orthogonal group SO(2). The reason for calling rotations “orthogonal
transformations” will emerge in Chapter 3, where we generalize the idea of
rotation to the $n$-dimensional space $\mathbb{R}^n$ and define a group SO($n$) for each
dimension $n$. In this chapter we are concerned mainly with the groups
SO(2) and SO(3), which are typical in some ways, but also exceptional
in having an alternative description in terms of higher-dimensional “numbers.”

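Each rotation $R_\theta$ of $\mathbb{R}^2$ can be represented by the complex number

$$z_\theta = \cos\theta + i\sin\theta$$

because if we multiply an arbitrary point $(x,y) = x + iy$ by $z_\theta$ we get

$$\begin{aligned}
z_\theta(x+iy) &= (\cos\theta + i\sin\theta)(x+iy) \\
&= x\cos\theta - y\sin\theta + i(x\sin\theta + y\cos\theta) \\
&= (x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta),
\end{aligned}$$

which is the result of rotating $(x,y)$ through angle $\theta$. Moreover, the ordinary
product $z_\theta z_\varphi$ represents the result of combining $R_\theta$ and $R_\varphi$.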
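
The agreement between the matrix description and the complex-number description of a plane rotation can be illustrated with a short Python sketch. It rotates a point once with the matrix formula and once by complex multiplication, then compares the results. The function names are hypothetical and introduced only for this illustration.

```python
import cmath
import math

def rotate_by_matrix(theta, x, y):
    """Apply the rotation matrix for angle theta to the point (x, y)."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c)

def rotate_by_complex(theta, x, y):
    """Multiply x + iy by cos(theta) + i sin(theta) and read off the coordinates."""
    z = cmath.exp(1j * theta) * complex(x, y)
    return (z.real, z.imag)

def close(p, q, tol=1e-12):
    """Compare two points coordinate by coordinate, allowing for rounding error."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

print(close(rotate_by_matrix(0.8, 2.0, -1.5), rotate_by_complex(0.8, 2.0, -1.5)))
```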

Rotations of $\mathbb{R}^3$ and $\mathbb{R}^4$ can be represented, in a slightly more complicated
way, by four-dimensional “numbers” called quaternions. We introduce
quaternions in Section 1.3 via certain $2 \times 2$ complex matrices, and to
pave the way for them we first investigate the relation between complex
numbers and $2 \times 2$ real matrices in Section 1.2.


What is a Lie group? 


The most general definition of a Lie group $G$ is a group that is also a smooth
manifold. That is, the group “product” and “inverse” operations are smooth
functions on the manifold $G$. For readers not familiar with groups we give
a crash course in Section 2.1, but we are not going to define smooth manifolds
in this book, because we are not going to study general Lie groups.

Instead we are going to study matrix Lie groups, which include most
of the interesting Lie groups but are much easier to handle. A matrix
Lie group is a set of $n \times n$ matrices (for some fixed $n$) that is closed under
products, inverses, and nonsingular limits. The third closure condition
means that if $A_1, A_2, A_3, \ldots$ is a convergent sequence of matrices in $G$, and
$A = \lim_{k\to\infty} A_k$ has an inverse, then $A$ is in $G$. We say more about the limit
concept for matrices in Section 4.5, but for $n \times n$ real matrices it is just the
limit concept in $\mathbb{R}^{n^2}$.

We can view all matrix Lie groups as groups of real matrices, but it is 
natural to allow the matrix entries to be complex numbers or quaternions 
as well. Real entries suffice in principle because complex numbers and 
quaternions can themselves be represented by real matrices (see Sections 
1.2 and 1.3). 

It is perhaps surprising that closure under nonsingular limits is equiv- 
alent to smoothness for matrix groups. Since we avoid the general con- 
cept of smoothness, we cannot fully explain why closed matrix groups are 
“smooth” in the technical sense. However, in Chapter 7 we will construct 
a tangent space $T_1(G)$ for any matrix Lie group $G$ from tangent vectors
to smooth paths in $G$. We find the tangent vectors using only elementary
single-variable calculus, and it can also be shown that the space $T_1(G)$ has
the same dimension as G. Thus G is “smooth” in the sense that it has a 
tangent space, of the appropriate dimension, at each point. 


Exercises 


Since rotation through angle $\theta + \varphi$ is the result of rotating through $\theta$, then rotating
through $\varphi$, we can derive formulas for $\sin(\theta+\varphi)$ and $\cos(\theta+\varphi)$ in terms of $\sin\theta$,
$\sin\varphi$, $\cos\theta$, and $\cos\varphi$.


1.1.1 Explain, by interpreting $z_{\theta+\varphi}$ in two different ways, why

$$\cos(\theta+\varphi) + i\sin(\theta+\varphi) = (\cos\theta + i\sin\theta)(\cos\varphi + i\sin\varphi).$$

Deduce that

$$\sin(\theta+\varphi) = \sin\theta\cos\varphi + \cos\theta\sin\varphi,$$

$$\cos(\theta+\varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi.$$


1.1.2 Deduce formulas for $\sin 2\theta$ and $\cos 2\theta$ from the formulas in Exercise 1.1.1.

1.1.3 Also deduce, from Exercise 1.1.1, that

$$\tan(\theta+\varphi) = \frac{\tan\theta + \tan\varphi}{1 - \tan\theta\tan\varphi}.$$


1.1.4 Using Exercise 1.1.3, or otherwise, write down the formula for $\tan(\theta - \varphi)$,
and deduce that lines through $O$ at angles $\theta$ and $\varphi$ are perpendicular if and
only if $\tan\theta = -1/\tan\varphi$.


1.1.5 Write down the complex number $z_{-\theta}$ and the inverse of the matrix for rotation
through $\theta$, and verify that they correspond.


1.2 Matrix representation of complex numbers 


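A good way to see why the matrices $R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ behave the same
as the complex numbers $z_\theta = \cos\theta + i\sin\theta$ is to write $R_\theta$ as the linear
combination

$$R_\theta = \cos\theta \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \sin\theta \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

of the basis matrices

$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \mathbf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

It is easily checked that

$$\mathbf{1}^2 = \mathbf{1}, \qquad \mathbf{1}\mathbf{i} = \mathbf{i}\mathbf{1} = \mathbf{i}, \qquad \mathbf{i}^2 = -\mathbf{1},$$

so the matrices $\mathbf{1}$ and $\mathbf{i}$ behave exactly the same as the complex numbers 1
and $i$.

In fact, the matrices

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} = a\mathbf{1} + b\mathbf{i}, \quad\text{where } a, b \in \mathbb{R},$$

behave exactly the same as the complex numbers $a + bi$ under addition
and multiplication, so we can represent all complex numbers by $2 \times 2$ real
matrices, not just the complex numbers $z_\theta$ that represent rotations. This
representation offers a “linear algebra explanation” of certain properties of
complex numbers, for example:


• The squared absolute value, $|a+bi|^2 = a^2 + b^2$ of the complex number
$a + bi$ is the determinant of the corresponding matrix $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$.

• Therefore, the multiplicative property of absolute value, $|z_1z_2| = |z_1||z_2|$,
follows from the multiplicative property of determinants,

$$\det(A_1A_2) = \det(A_1)\det(A_2).$$

(Take $A_1$ as the matrix representing $z_1$, and $A_2$ as the matrix representing $z_2$.)

• The inverse $z^{-1} = \frac{a - bi}{a^2 + b^2}$ of $z = a + bi \neq 0$ corresponds to the inverse
matrix

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}^{-1} = \frac{1}{a^2+b^2} \begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$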
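
The properties in the list above can be checked on a concrete example. The Python sketch below represents a complex number by its $2 \times 2$ real matrix and confirms, for one pair of numbers, that matrix products match complex products and that the determinant equals the squared absolute value. The names `to_matrix`, `matmul`, and `det` are assumptions made for the illustration.

```python
def to_matrix(z):
    """Represent z = a + bi by the real matrix ((a, -b), (b, a))."""
    a, b = z.real, z.imag
    return ((a, -b), (b, a))

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

z1, z2 = complex(3, 4), complex(1, -2)
# The matrix of a product is the product of the matrices.
print(matmul(to_matrix(z1), to_matrix(z2)) == to_matrix(z1 * z2))
# The determinant is the squared absolute value a^2 + b^2.
print(det(to_matrix(z1)), z1.real ** 2 + z1.imag ** 2)
```

Small integer values are used so that the floating-point arithmetic is exact and the comparison with `==` is safe.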


The two-square identity 


If we set $z_1 = a_1 + ib_1$ and $z_2 = a_2 + ib_2$, then the multiplicative property
of (squared) absolute value states that

$$(a_1^2 + b_1^2)(a_2^2 + b_2^2) = (a_1a_2 - b_1b_2)^2 + (a_1b_2 + a_2b_1)^2,$$

as can be checked by working out the product $z_1z_2$ and its squared absolute
value. This identity is particularly interesting in the case of integers
$a_1, b_1, a_2, b_2$, because it says that


(a sum of two squares) x (a sum of two squares) = (a sum of two squares). 


This fact was noticed nearly 2000 years ago by Diophantus, who men- 
tioned an instance of it in Book III, Problem 19, of his Arithmetica. How- 
ever, Diophantus said nothing about sums of three squares—with good rea- 
son, because there is no such three-square identity. For example 


$$(1^2 + 1^2 + 1^2)(0^2 + 1^2 + 2^2) = 3 \times 5 = 15,$$


and 15 is not a sum of three integer squares. 
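
Both claims can be explored by brute force. The sketch below, in Python, applies the two-square identity to one pair of sums and searches for representations of small integers as sums of a given number of squares. The function names are hypothetical, and the search is a naive enumeration intended only for small inputs.

```python
from itertools import product
from math import isqrt

def two_square_product(a1, b1, a2, b2):
    """Return (x, y) with (a1^2 + b1^2)(a2^2 + b2^2) = x^2 + y^2."""
    return (a1 * a2 - b1 * b2, a1 * b2 + a2 * b1)

def is_sum_of_squares(n, k):
    """Brute-force test: is n a sum of k squares of nonnegative integers?"""
    candidates = range(isqrt(n) + 1)
    return any(sum(x * x for x in xs) == n for xs in product(candidates, repeat=k))

x, y = two_square_product(1, 1, 1, 2)
print(x * x + y * y == (1 + 1) * (1 + 4))
# 3 and 5 are sums of three squares, but their product 15 is not.
print(is_sum_of_squares(3, 3), is_sum_of_squares(5, 3), is_sum_of_squares(15, 3))
```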

This is an early warning sign that there are no three-dimensional num- 
bers. In fact, there are no n-dimensional numbers for any n > 2; however, 
there is a “near miss” for $n = 4$. One can define “addition” and “multiplication”
for quadruples $q = (a,b,c,d)$ of real numbers so as to satisfy all
the basic laws of arithmetic except $q_1q_2 = q_2q_1$ (the commutative law of
multiplication). This system of arithmetic for quadruples is the quaternion
algebra that we introduce in the next section.


Exercises


1.2.1 Derive the two-square identity from the multiplicative property of det. 


1.2.2 Write 5 and 13 as sums of two squares, and hence express 65 as a sum of 
two squares using the two-square identity. 


1.2.3 Using the two-square identity, express $37^2$ and $37^4$ as sums of two nonzero
squares.


The absolute value $|z| = \sqrt{a^2 + b^2}$ represents the distance of $z$ from $O$, and
more generally, $|u - v|$ represents the distance between $u$ and $v$. When combined
with the distributive law,

$$u(v - w) = uv - uw,$$

a geometric property of multiplication comes to light.


1.2.4 Deduce, from the distributive law and multiplicative absolute value, that

$$|uv - uw| = |u||v - w|.$$

Explain why this says that multiplication of the whole plane of complex
numbers by $u$ multiplies all distances by $|u|$.


1.2.5 Deduce from Exercise 1.2.4 that multiplication of the whole plane of complex
numbers by $\cos\theta + i\sin\theta$ leaves all distances unchanged.


A map that leaves all distances unchanged is called an isometry (from the
Greek for “same measure”), so multiplication by $\cos\theta + i\sin\theta$ is an isometry of
the plane. (In Section 1.1 we defined the corresponding rotation map $R_\theta$ as a linear
map that moves 1 and $i$ in a certain way; it is not obvious from this definition that
a rotation is an isometry.)


1.3 Quaternions


By associating the ordered pair $(a,b)$ with the complex number $a + ib$ or the
matrix $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ we can speak of the “sum,” “product,” and “absolute value”
of ordered pairs. In the same way, we can speak of the “sum,” “product,”
and “absolute value” of ordered quadruples by associating each ordered
quadruple $(a,b,c,d)$ of real numbers with the matrix

$$q = \begin{pmatrix} a+id & -b-ic \\ b-ic & a-id \end{pmatrix}. \tag{*}$$

We call any matrix of the form (*) a quaternion. (This is not the only
way to associate a matrix with a quadruple. I have chosen these complex
matrices because they extend the real matrices used in the previous section
to represent complex numbers. Thus complex numbers are the special
quaternions with $c = d = 0$.)

It is clear that the sum of any two matrices of the form (*) is another
matrix of the same form, and it can be checked (Exercise 1.3.2) that the
product of two matrices of the form (*) is of the form (*). Thus we can
define the sum and product of quaternions to be just the matrix sum and
product. Also, if the squared absolute value $|q|^2$ of a quaternion $q$ is defined
to be the determinant of $q$, then we have

$$\det q = \det \begin{pmatrix} a+id & -b-ic \\ b-ic & a-id \end{pmatrix} = a^2 + b^2 + c^2 + d^2.$$

So $|q|^2$ is the squared distance of the point $(a,b,c,d)$ from $O$ in $\mathbb{R}^4$.
The quaternion sum operation has the same basic properties as addition
for numbers, namely

$$\begin{aligned}
q_1 + q_2 &= q_2 + q_1, && \text{(commutative law)} \\
q_1 + (q_2 + q_3) &= (q_1 + q_2) + q_3, && \text{(associative law)} \\
q + (-q) &= \mathbf{0} \quad\text{where } \mathbf{0} \text{ is the zero matrix}, && \text{(inverse law)} \\
q + \mathbf{0} &= q. && \text{(identity law)}
\end{aligned}$$

The quaternion product operation does not have all the properties of
multiplication of numbers (in general, the commutative property $q_1q_2 = q_2q_1$
fails), but well-known properties of the matrix product imply the following
properties of the quaternion product:

$$\begin{aligned}
q_1(q_2q_3) &= (q_1q_2)q_3, && \text{(associative law)} \\
qq^{-1} &= \mathbf{1} \quad\text{for } q \neq \mathbf{0}, && \text{(inverse law)} \\
q\mathbf{1} &= q, && \text{(identity law)} \\
q_1(q_2 + q_3) &= q_1q_2 + q_1q_3. && \text{(left distributive law)}
\end{aligned}$$

Here $\mathbf{0}$ and $\mathbf{1}$ denote the $2 \times 2$ zero and identity matrices, which are also
quaternions. The right distributive law $(q_2 + q_3)q_1 = q_2q_1 + q_3q_1$ of course
holds too, and is distinct from the left distributive law because of the
noncommutative product.

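The noncommutative nature of the quaternion product is exposed more
clearly when we write

$$\begin{pmatrix} a+di & -b-ci \\ b-ci & a-di \end{pmatrix} = a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k},$$

where

$$\mathbf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \mathbf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad \mathbf{k} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}.$$

Thus $\mathbf{1}$ behaves like the number 1, $\mathbf{i}^2 = -\mathbf{1}$ as before, and also $\mathbf{j}^2 = \mathbf{k}^2 = -\mathbf{1}$.
The noncommutativity is concentrated in the products of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$, which
are summarized in Figure 1.2. The product of any two distinct elements is
the third element in the circle, with a $+$ sign if an arrow points from the
first element to the second, and a $-$ sign otherwise. For example, $\mathbf{i}\mathbf{j} = \mathbf{k}$,
but $\mathbf{j}\mathbf{i} = -\mathbf{k}$, so $\mathbf{i}\mathbf{j} \neq \mathbf{j}\mathbf{i}$.

[Figure 1.2: Products of the imaginary quaternion units.]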

The failure of the commutative law is actually a good thing, because it 
enables quaternions to represent other things that do not commute, such as 
rotations in three and four dimensions. 
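
The products of the quaternion units can be confirmed directly from their matrices. The following Python sketch multiplies the $2 \times 2$ complex matrices for $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ given above; the variable names `ONE`, `I`, `J`, `K` and the helpers `matmul` and `neg` are labels chosen for this illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices with complex entries."""
    return tuple(
        tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
        for r in range(2)
    )

def neg(A):
    """Negative of a 2x2 matrix."""
    return tuple(tuple(-x for x in row) for row in A)

# Matrices for the quaternion units 1, i, j, k (1j is Python's imaginary unit).
ONE = ((1, 0), (0, 1))
I = ((0, -1), (1, 0))
J = ((0, -1j), (-1j, 0))
K = ((1j, 0), (0, -1j))

# ij = k but ji = -k, so the product is not commutative.
print(matmul(I, J) == K, matmul(J, I) == neg(K))
# Each imaginary unit squares to -1.
print(all(matmul(U, U) == neg(ONE) for U in (I, J, K)))
```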

As with complex numbers, there is a linear algebra explanation of some 
less obvious properties of quaternion multiplication. 


• The absolute value has the multiplicative property $|q_1q_2| = |q_1||q_2|$,
by the multiplicative property of det: $\det(q_1q_2) = \det(q_1)\det(q_2)$.

• Each nonzero quaternion $q$ has an inverse $q^{-1}$, namely the matrix
inverse of $q$.

• From the matrix (*) for $q$ we get an explicit formula for $q^{-1}$. If
$q = a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \neq 0$ then

$$q^{-1} = \frac{1}{a^2 + b^2 + c^2 + d^2}\,(a\mathbf{1} - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}).$$

• The quaternion $a\mathbf{1} - b\mathbf{i} - c\mathbf{j} - d\mathbf{k}$ is called the quaternion conjugate
$\bar{q}$ of $q = a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$, and we have $q\bar{q} = a^2 + b^2 + c^2 + d^2 = |q|^2$.

• The quaternion conjugate is not the result of taking the complex conjugate
of each entry in the matrix $q$. In fact, $\bar{q}$ is the result of taking
the complex conjugate of each entry in the transposed matrix $q^{\mathsf{T}}$.
Then it follows from $(q_1q_2)^{\mathsf{T}} = q_2^{\mathsf{T}} q_1^{\mathsf{T}}$ that $\overline{q_1q_2} = \bar{q}_2\,\bar{q}_1$.
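
The properties listed above can be tested on examples with exact arithmetic. The Python sketch below stores a quaternion $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ as the tuple `(a, b, c, d)` and multiplies by the standard coordinate form of the quaternion product, which is an assumption here in the sense that the text defines the product through matrices rather than coordinates. All function names are illustrative.

```python
from fractions import Fraction

def qmul(p, q):
    """Quaternion product in coordinates (a, b, c, d) = a1 + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conj(q):
    """Quaternion conjugate a1 - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm2(q):
    """Squared absolute value a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)

def inverse(q):
    """Inverse of a nonzero integer quaternion: the conjugate divided by |q|^2."""
    n = norm2(q)
    return tuple(Fraction(x, n) for x in conj(q))

p, q = (1, 2, 3, 4), (2, -1, 0, 5)
print(norm2(qmul(p, q)) == norm2(p) * norm2(q))      # multiplicative absolute value
print(qmul(q, inverse(q)) == (1, 0, 0, 0))           # q times its inverse is 1
print(conj(qmul(p, q)) == qmul(conj(q), conj(p)))    # conjugation reverses products
```

Using `Fraction` keeps the inverse exact, so the comparisons do not depend on rounding.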


The algebra of quaternions was discovered by Hamilton in 1843, and 
it is denoted by H in his honor. He started with just i and j (hoping to 
find an algebra of triples analogous to the complex algebra of pairs), but 
later introduced k = ij to escape from apparently intractable problems with 
triples (he did not know, at first, that there is no three-square identity). The 
matrix representation was discovered in 1858, by Cayley. 


The 3-sphere of unit quaternions 


The quaternions $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$ of absolute value 1, or unit quaternions,
satisfy the equation

$$a^2 + b^2 + c^2 + d^2 = 1.$$

Hence they form the analogue of the sphere, called the 3-sphere $\mathbb{S}^3$, in the
space $\mathbb{R}^4$ of all 4-tuples $(a,b,c,d)$. It follows from the multiplicative property
and the formula for inverses above that the product of unit quaternions
is again a unit quaternion, and hence $\mathbb{S}^3$ is a group under quaternion
multiplication. Like the 1-sphere $\mathbb{S}^1$ of unit complex numbers, the 3-sphere
of unit quaternions encapsulates a group of rotations, though not quite so
directly. In the next two sections we show how unit quaternions may be
used to represent rotations of ordinary space $\mathbb{R}^3$.


Exercises 


When Hamilton discovered $\mathbb{H}$ he described quaternion multiplication very
concisely by the relations

$$\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i}\mathbf{j}\mathbf{k} = -1.$$

1.3.1 Verify that Hamilton’s relations hold for the matrices $\mathbf{1}$, $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Also
show (assuming associativity and inverses) that these relations imply all
the products of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ shown in Figure 1.2.


1.3.2 Verify that the product of quaternions is indeed a quaternion. (Hint: It helps
to write each quaternion in the form

$$\begin{pmatrix} \alpha & -\beta \\ \bar\beta & \bar\alpha \end{pmatrix},$$

where $\bar\alpha = x - iy$ is the complex conjugate of $\alpha = x + iy$.)

1.3.3 Check that $\bar{q}$ is the result of taking the complex conjugate of each entry in
$q^{\mathsf{T}}$, and hence show that $\overline{q_1q_2} = \bar{q}_2\,\bar{q}_1$ for any quaternions $q_1$ and $q_2$.


1.3.4 Also check that $q\bar{q} = |q|^2$.


Cayley’s matrix representation makes it easy (in principle) to derive an amazing
algebraic identity.


1.3.5 Show that the multiplicative property of determinants gives the complex
two-square identity (discovered by Gauss around 1820)

$$(|\alpha_1|^2 + |\beta_1|^2)(|\alpha_2|^2 + |\beta_2|^2) = |\alpha_1\alpha_2 - \beta_1\bar\beta_2|^2 + |\alpha_1\beta_2 + \beta_1\bar\alpha_2|^2.$$


1.3.6 Show that the multiplicative property of determinants gives the real
four-square identity

$$\begin{aligned}
(a_1^2 + b_1^2 + c_1^2 + d_1^2)(a_2^2 + b_2^2 + c_2^2 + d_2^2) ={}& (a_1a_2 - b_1b_2 - c_1c_2 - d_1d_2)^2 \\
&+ (a_1b_2 + b_1a_2 + c_1d_2 - d_1c_2)^2 \\
&+ (a_1c_2 - b_1d_2 + c_1a_2 + d_1b_2)^2 \\
&+ (a_1d_2 + b_1c_2 - c_1b_2 + d_1a_2)^2.
\end{aligned}$$


This identity was discovered by Euler in 1748, nearly 100 years before the dis- 
covery of quaternions! Like Diophantus, he was interested in the case of integer 
squares, in which case the identity says that 


(a sum of four squares) x (a sum of four squares) = (a sum of four squares). 


This was the first step toward proving the theorem that every positive integer is 
the sum of four integer squares. The proof was completed by Lagrange in 1770. 


1.3.7 Express 97 and 99 as sums of four squares. 


1.3.8 Using Exercise 1.3.6, or otherwise, express 97 x 99 as a sum of four squares. 


1.4 Consequences of multiplicative absolute value 


The multiplicative absolute value, for both complex numbers and quaternions,
first appeared in number theory as a property of sums of squares. It
was noticed only later that it has geometric implications, relating multiplication
to rigid motions of $\mathbb{R}^2$, $\mathbb{R}^3$, and $\mathbb{R}^4$. Suppose first that $u$ is a complex
number of absolute value 1. Without any computation with $\cos\theta$ and $\sin\theta$,
we can see that multiplication of $\mathbb{C} = \mathbb{R}^2$ by $u$ is a rotation of the plane as
follows.

Let $v$ and $w$ be any two complex numbers, and consider their images,
$uv$ and $uw$ under multiplication by $u$. Then we have

$$\begin{aligned}
\text{distance from } uv \text{ to } uw &= |uv - uw| \\
&= |u(v - w)| && \text{by the distributive law} \\
&= |u||v - w| && \text{by multiplicative absolute value} \\
&= |v - w| && \text{because } |u| = 1 \\
&= \text{distance from } v \text{ to } w.
\end{aligned}$$

In other words, multiplication by $u$ with $|u| = 1$ is a rigid motion, also
known as an isometry, of the plane. Moreover, this isometry leaves $O$
fixed, because $u \times 0 = 0$. And if $u \neq 1$, no other point $v$ is fixed, because
$uv = v$ implies $u = 1$. The only motion of the plane with these properties
is rotation about $O$.
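
The distance-preserving property is easy to observe numerically. The Python sketch below multiplies a few points of the plane by a complex number of absolute value 1 and checks that all pairwise distances are unchanged up to rounding; the function name `preserves_distances` and the sample points are arbitrary choices for the illustration.

```python
import cmath
from itertools import combinations

def preserves_distances(u, points, tol=1e-12):
    """Check that multiplying every point by u leaves all pairwise distances unchanged."""
    return all(
        abs(abs(u * v - u * w) - abs(v - w)) < tol
        for v, w in combinations(points, 2)
    )

points = [complex(2, -1), complex(-0.5, 3), complex(0, 0), complex(1.25, 0.75)]
u = cmath.exp(0.9j)                            # a complex number of absolute value 1
print(preserves_distances(u, points))          # multiplication by u is an isometry
print(preserves_distances(2 * u, points))      # scaling by |2u| = 2 changes distances
```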

Exactly the same argument applies to quaternion multiplication, at least
as far as preservation of distance is concerned: if we multiply the space
$\mathbb{R}^4$ of quaternions by a quaternion of absolute value 1, then the result is
an isometry of $\mathbb{R}^4$ that leaves the origin fixed. It is in fact reasonable to
interpret this isometry of $\mathbb{R}^4$ as a “rotation,” but first we want to show that
quaternion multiplication also gives a way to study rotations of $\mathbb{R}^3$. To see
how, we look at a natural three-dimensional subspace of the quaternions.


Pure imaginary quaternions 
The pure imaginary quaternions are those of the form

$$p = b\mathbf{i} + c\mathbf{j} + d\mathbf{k}.$$

They form a three-dimensional space that we will denote by $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$,
or sometimes $\mathbb{R}^3$ for short. The space $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is the orthogonal
complement to the line $\mathbb{R}\mathbf{1}$ of quaternions of the form $a\mathbf{1}$, which we will
call real quaternions. From now on we write the real quaternion $a\mathbf{1}$ simply
as $a$, and denote the line of real quaternions simply by $\mathbb{R}$.

It is clear that the sum of any two members of $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is itself
a member of $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, but this is not generally true of products. In
fact, if $u = u_1\mathbf{i} + u_2\mathbf{j} + u_3\mathbf{k}$ and $v = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ then the multiplication
diagram for $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ (Figure 1.2) gives

$$uv = -(u_1v_1 + u_2v_2 + u_3v_3) + (u_2v_3 - u_3v_2)\mathbf{i} - (u_1v_3 - u_3v_1)\mathbf{j} + (u_1v_2 - u_2v_1)\mathbf{k}.$$

This relates the quaternion product $uv$ to two other products on $\mathbb{R}^3$ that are
well known in linear algebra: the inner (or “scalar” or “dot”) product,

$$u \cdot v = u_1v_1 + u_2v_2 + u_3v_3,$$

and the vector (or “cross”) product

$$u \times v = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} = (u_2v_3 - u_3v_2)\mathbf{i} - (u_1v_3 - u_3v_1)\mathbf{j} + (u_1v_2 - u_2v_1)\mathbf{k}.$$

In terms of the scalar and vector products, the quaternion product is

$$uv = -u \cdot v + u \times v.$$

Since $u \cdot v$ is a real number, this formula shows that $uv$ is in $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$
only if $u \cdot v = 0$, that is, only if $u$ is orthogonal to $v$.

The formula $uv = -u \cdot v + u \times v$ also shows that $uv$ is real if and only
if $u \times v = 0$, that is, if $u$ and $v$ have the same (or opposite) direction. In
particular, if $u \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ and $|u| = 1$ then

$$u^2 = -u \cdot u = -|u|^2 = -1.$$

Thus every unit vector in $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is a “square root of $-1$.” (This, by
the way, is another sign that $\mathbb{H}$ does not satisfy all the usual laws of algebra.
If it did, the equation $u^2 = -1$ would have at most two solutions.)
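
The decomposition of the product of pure imaginary quaternions into a dot product and a cross product can be checked on an example. In the Python sketch below, quaternions are tuples `(a, b, c, d)` multiplied by the standard coordinate formula, and vectors of the pure imaginary space are triples; these conventions and all function names are assumptions made for the illustration.

```python
from fractions import Fraction

def qmul(p, q):
    """Quaternion product in coordinates (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def dot(u, v):
    """Scalar product of two triples."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Vector product of two triples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def pure(v):
    """Pure imaginary quaternion with imaginary part v."""
    return (0, v[0], v[1], v[2])

u, v = (1, 2, 3), (4, 5, 6)
# Real part is -u.v and imaginary part is u x v.
print(qmul(pure(u), pure(v)) == (-dot(u, v),) + cross(u, v))

unit = (Fraction(1, 3), Fraction(2, 3), Fraction(2, 3))   # a unit vector
print(qmul(pure(unit), pure(unit)) == (-1, 0, 0, 0))      # a square root of -1
```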


Exercises 


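The cross product is an operation on $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ because $u \times v$ is in $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$
for any $u, v \in \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$. However, it is neither a commutative nor associative
operation, as Exercises 1.4.1 and 1.4.3 show.


1.4.1 Prove the antisymmetric property $u \times v = -v \times u$.
1.4.2 Prove that $u \times (v \times w) = v(u \cdot w) - w(u \cdot v)$ for pure imaginary $u, v, w$.
1.4.3 Deduce from Exercise 1.4.2 that $\times$ is not associative.


1.4.4 Also deduce the Jacobi identity for the cross product:

$$u \times (v \times w) + w \times (u \times v) + v \times (w \times u) = 0.$$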


The antisymmetric and Jacobi properties show that the cross product is not com- 
pletely lawless. These properties define what we later call a Lie algebra. 
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1.5 Quaternion representation of space rotations 


A quaternion ¢ of absolute value 1, like a complex number of absolute value 
1, has a “real part” cos @ and an “imaginary part” of absolute value sin 0, 
orthogonal to the real part and hence in Ri+ Rj-+ Rk. This means that 


t=cos@+usin90, 


where u is a unit vector in Ri+Rj+ Rk, and hence u* = —1 by the remark 
at the end of the previous section. 

Such a unit quaternion ¢ induces a rotation of Ri+ Rj-+ Rk, though 
not simply by multiplication, since the product of tf and a member g of 
Ri+ Rj+ Rk may not belong to Ri+ Rj+ Rk. Instead, we send each 
q € Ri+ Rj+ Rk tot” 'gt, which turns out to be a member of Ri+ Rj+Rk. 

To see why, first note that 


t-! —7/|t|? = cos 0 —usinO, 


by the formulas for g~! and 7 in Section 1.3. 


Since t~! exists, multiplication of HI on either side by ft or t~! is an 
invertible map and hence a bijection of H onto itself. It follows that the 
map q+>t~!qt, called conjugation by t, is a bijection of H. Conjugation by 
t also maps the real line R onto itself, because t~'rt = r for a real number 
r; hence it also maps the orthogonal complement Ri-+ Rj-+ Rk onto itself. 
This is because conjugation by f is an isometry, since multiplication on 
either side by a unit quaternion is an isometry. 

It looks as though we are onto something with conjugation by f = 
cos 8 + usin @, and indeed we have the following theorem. 


Rotation by conjugation. /ft = cos @+usin@, where u € Ri+ Rj+ Rk 
is a unit vector, then conjugation by t rotates Ri+ Rj+ Rk through angle 
20 about axis u. 


Proof. First, observe that the line Ru of real multiples of u is fixed by the 
conjugation map, because 


t~'ut = (cos @ — usin @)u(cos @ + usin @) 
= (ucos 0 — u* sin®@) (cos 6 + usin @) 
=(ucos@+sin@)(cos@+usin@) because u? = —1 
= u(cos” 6 + sin* 8) + sin @ cos @ + u’ sin @ cos @ 


=u also because uw? = —1. 
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It follows, since conjugation by f is an isometry of Ri+ Rj-+ Rk, that 
its restriction to the plane through O in Ri+ Rj+ Rk orthogonal to the line 
Ru is also an isometry. And if the restriction to this plane is a rotation, then 
conjugation by f is a rotation of the whole space Ri+ Rj + Rk. 

To see whether this is indeed the case, choose a unit vector $v$ orthogonal
to $u$ in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, so $u \cdot v = 0$. Then let $w = u \times v$, which equals $uv$
because $u \cdot v = 0$, so $\{u, v, w\}$ is an orthonormal basis of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ with
$uv = w$, $vw = u$, $wu = v$, $uv = -vu$ and so on. It remains to show that

$$t^{-1}vt = v\cos 2\theta - w\sin 2\theta, \qquad t^{-1}wt = v\sin 2\theta + w\cos 2\theta,$$

because this means that conjugation by $t$ rotates the basis vectors $v$ and $w$,
and hence the whole plane orthogonal to the line $\mathbb{R}u$, through angle $2\theta$.
This is confirmed by the following computation:


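$$\begin{aligned}
t^{-1}vt &= (\cos\theta - u\sin\theta)\,v\,(\cos\theta + u\sin\theta) \\
&= (v\cos\theta - uv\sin\theta)(\cos\theta + u\sin\theta) \\
&= v\cos^2\theta - uv\sin\theta\cos\theta + vu\sin\theta\cos\theta - uvu\sin^2\theta \\
&= v\cos^2\theta - 2uv\sin\theta\cos\theta + u^2v\sin^2\theta && \text{because } vu = -uv \\
&= v(\cos^2\theta - \sin^2\theta) - 2w\sin\theta\cos\theta && \text{because } u^2 = -1,\ uv = w \\
&= v\cos 2\theta - w\sin 2\theta.
\end{aligned}$$

A similar computation (try it) shows that $t^{-1}wt = v\sin 2\theta + w\cos 2\theta$, as
required.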
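The theorem can also be checked numerically. The following is a minimal sketch, not part of the proof: quaternions are stored as 4-tuples $(a, b, c, d)$ standing for $a + bi + cj + dk$, and the helper names `qmul`, `conjugate_by`, and `close`, as well as the sample axis and angle, are our own illustrative choices.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conjugate_by(t, q):
    """Return t^{-1} q t for a unit quaternion t, whose inverse is its conjugate."""
    a, b, c, d = t
    return qmul(qmul((a, -b, -c, -d), q), t)

def close(p, q, tol=1e-12):
    """Componentwise comparison up to floating-point error."""
    return all(abs(x - y) < tol for x, y in zip(p, q))

theta = 0.7                          # arbitrary sample angle (assumption)
u = (0.0, 1/3, 2/3, 2/3)             # sample unit axis (assumption)
v = (0.0, 2/3, -2/3, 1/3)            # unit vector orthogonal to u
w = qmul(u, v)                       # w = uv, which equals u x v because u.v = 0

# t = cos(theta) + u sin(theta)
t = (math.cos(theta),) + tuple(math.sin(theta) * x for x in u[1:])

c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
expected_v = tuple(c2 * x - s2 * y for x, y in zip(v, w))   # v cos 2θ - w sin 2θ
expected_w = tuple(s2 * x + c2 * y for x, y in zip(v, w))   # v sin 2θ + w cos 2θ

print(close(conjugate_by(t, u), u))            # the axis is fixed
print(close(conjugate_by(t, v), expected_v))   # v is rotated through 2θ
print(close(conjugate_by(t, w), expected_w))   # w is rotated through 2θ
```

The three comparisons correspond to the three steps of the proof: the line through $u$ is fixed, and the basis vectors $v$ and $w$ of the orthogonal plane turn through $2\theta$.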


This theorem shows that every rotation of $\mathbb{R}^3$, given by an axis $u$ and
angle of rotation $\alpha$, is the result of conjugation by the unit quaternion

$$t = \cos\frac{\alpha}{2} + u\sin\frac{\alpha}{2}.$$


The same rotation is induced by $-t$, since $(-t)^{-1}s(-t) = t^{-1}st$. But $\pm t$
are the only unit quaternions that induce this rotation, because each unit
quaternion is uniquely expressible in the form $t = \cos\frac{\alpha}{2} + u\sin\frac{\alpha}{2}$, and the
rotation is uniquely determined by the two (axis, angle) pairs $(u, \alpha)$ and
$(-u, -\alpha)$. The quaternions $t$ and $-t$ are said to be antipodal, because they
represent diametrically opposite points on the 3-sphere of unit quaternions.
Thus the theorem says that rotations of $\mathbb{R}^3$ correspond to antipodal
pairs of unit quaternions. We also have the following important corollary.

Rotations form a group. The product of rotations is a rotation, and the
inverse of a rotation is a rotation.


Proof. The inverse of the rotation about axis $u$ through angle $\alpha$ is obviously
a rotation, namely, the rotation about axis $u$ through angle $-\alpha$.

It is not obvious what the product of two rotations is, but we can show
as follows that it has an axis and angle of rotation, and hence is a rotation.
Suppose we are given a rotation $r_1$ with axis $u_1$ and angle $\alpha_1$, and a rotation
$r_2$ with axis $u_2$ and angle $\alpha_2$. Then

$$r_1 \text{ is induced by conjugation by } t_1 = \cos\frac{\alpha_1}{2} + u_1\sin\frac{\alpha_1}{2}$$

and

$$r_2 \text{ is induced by conjugation by } t_2 = \cos\frac{\alpha_2}{2} + u_2\sin\frac{\alpha_2}{2},$$

hence the result $r_1r_2$ of doing $r_1$, then $r_2$, is induced by

$$q \mapsto t_2^{-1}(t_1^{-1}qt_1)t_2 = (t_1t_2)^{-1}q(t_1t_2),$$

which is conjugation by $t_1t_2 = t$. The quaternion $t$ is also a unit quaternion,
so

$$t = \cos\frac{\alpha}{2} + u\sin\frac{\alpha}{2}$$

for some unit imaginary quaternion $u$ and angle $\alpha$. Thus the product rotation
is the rotation about axis $u$ through angle $\alpha$.


The proof shows that the axis and angle of the product rotation $r_1r_2$ can
in principle be found from those of $r_1$ and $r_2$ by quaternion multiplication.
They may also be described geometrically, by the alternative proof of the
group property given in the exercises below.
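As an illustration of finding the product rotation by quaternion multiplication, here is a small sketch. The function names `from_axis_angle` and `to_axis_angle` and the two sample rotations are assumptions made for this example; the convention is the one used above, in which $t$ acts by $q \mapsto t^{-1}qt$ and $t_1t_2$ induces "$r_1$, then $r_2$."

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def from_axis_angle(axis, alpha):
    """Unit quaternion cos(alpha/2) + u sin(alpha/2), with u the normalized axis."""
    x, y, z = axis
    n = math.sqrt(x*x + y*y + z*z)
    s = math.sin(alpha / 2) / n
    return (math.cos(alpha / 2), x*s, y*s, z*s)

def to_axis_angle(t):
    """Recover (axis, alpha) from t = cos(alpha/2) + u sin(alpha/2)."""
    a, b, c, d = t
    s = math.sqrt(b*b + c*c + d*d)
    if s < 1e-12:
        return (1.0, 0.0, 0.0), 0.0   # identity rotation: the axis is arbitrary
    return (b / s, c / s, d / s), 2 * math.atan2(s, a)

# Sample rotations (assumptions): quarter turns about the i-axis and the j-axis.
t1 = from_axis_angle((1, 0, 0), math.pi / 2)
t2 = from_axis_angle((0, 1, 0), math.pi / 2)

t = qmul(t1, t2)                 # induces r1 followed by r2
axis, alpha = to_axis_angle(t)
print(axis, alpha)
```

For these two quarter turns the product quaternion is $\frac{1}{2}(1 + i + j + k)$, so the product rotation has axis in direction $i + j + k$ and angle $2\pi/3$.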


Exercises 


The following exercises introduce a small fragment of the geometry of isometries:
that any rotation of the plane or space is a product of two reflections. We begin
with the simplest case: representing rotation of the plane about O through angle
$\theta$ as the product of reflections in two lines through O.

If $\mathcal{L}$ is any line in the plane, then reflection in $\mathcal{L}$ is the transformation of the
plane that sends each point S to the point S′ such that SS′ is orthogonal to $\mathcal{L}$ and
$\mathcal{L}$ is equidistant from S and S′.

Figure 1.3: Reflection of S and its angle.


1.5.1 If $\mathcal{L}$ passes through P, and if S lies on one side of $\mathcal{L}$ at angle $\alpha$ (Figure 1.3),
show that S′ lies on the other side of $\mathcal{L}$ at angle $\alpha$, and that |PS| = |PS′|.


1.5.2 Deduce, from Exercise 1.5.1 or otherwise, that the rotation about P through
angle $\theta$ is the result of reflections in any two lines through P that meet at
angle $\theta/2$.


1.5.3 Deduce, from Exercise 1.5.2 or otherwise, that if $\mathcal{L}$, $\mathcal{M}$, and $\mathcal{N}$ are lines
situated as shown in Figure 1.4, then the result of rotation about P through
angle $\theta$, followed by rotation about Q through angle $\varphi$, is rotation about R
through angle $\chi$ (with rotations in the senses indicated by the arrows).


Figure 1.4: Three lines and three rotations. 


1.5.4 If $\mathcal{L}$ and $\mathcal{N}$ are parallel, so R does not exist, what isometry is the result of
the rotations about P and Q?


Now we extend these ideas to $\mathbb{R}^3$. A rotation about a line through O (called
the axis of rotation) is the product of reflections in planes through O that meet
along the axis. To make the reflections easier to visualize, we do not draw the
planes, but only their intersections with the unit sphere (see Figure 1.5).

These intersections are curves called great circles, and reflection in a great
circle is the restriction to the sphere of reflection in a plane through O.

Figure 1.5: Reflections in great circles on the sphere.


1.5.5 Adapt the argument of Exercise 1.5.3 to great circles $\mathcal{L}$, $\mathcal{M}$, and $\mathcal{N}$ shown
in Figure 1.5. What is the conclusion?


1.5.6 Explain why there is no exceptional case analogous to Exercise 1.5.4. Deduce
that the product of any two rotations of $\mathbb{R}^3$ about O is another rotation
about O, and explain how to find the axis of the product rotation.


The idea of representing isometries as products of reflections is also useful in
higher dimensions. We use this idea again in Section 2.4, where we show that any
isometry of $\mathbb{R}^n$ that fixes O is the product of at most n reflections in hyperplanes
through O.


1.6 Discussion 


The geometric properties of complex numbers were discovered long before
the complex numbers themselves. Diophantus (already mentioned in Section 1.2)
was aware of the two-square identity, and indeed he associated a
sum of two squares, $a^2 + b^2$, with the right-angled triangle with perpendicular
sides a and b. Thus, Diophantus was vaguely aware of two-dimensional
objects (right-angled triangles) with a multiplicative property (of their
hypotenuses). Around 1590, Viète noticed that the Diophantus “product”
of triangles with sides (a, b) and (c, d), namely the triangle with sides
(ac − bd, bc + ad), also has an additive property, of angles (Figure 1.6).


Figure 1.6: The Diophantus “product” of triangles. 


The algebra of complex numbers emerged from the study of polynomial
equations in the sixteenth century, particularly the solution of cubic
equations by the Italian mathematicians del Ferro, Tartaglia, Cardano,
and Bombelli. Complex numbers were not required for the solution of
quadratic equations, because in the sixteenth century one could say that
$x^2 + 1 = 0$, for example, has no solution. The formal solution $x = \sqrt{-1}$
was just a signal that no solution really exists. Cubic equations force the
issue because the equation $x^3 = px + q$ has solution

$$x = \sqrt[3]{\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}} + \sqrt[3]{\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}}$$

(the “Cardano formula”). Thus, according to the Cardano formula the
solution of $x^3 = 15x + 4$ is

$$x = \sqrt[3]{2 + \sqrt{2^2 - 5^3}} + \sqrt[3]{2 - \sqrt{2^2 - 5^3}} = \sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i}.$$


But the symbol $i = \sqrt{-1}$ cannot be signaling NO SOLUTION here, because
there is an obvious solution $x = 4$. How can $\sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i}$ be the
solution when 4 is?


In 1572, Bombelli resolved this conflict, and launched the algebra of
complex numbers, by observing that

$$(2 + i)^3 = 2 + 11i, \qquad (2 - i)^3 = 2 - 11i,$$

and therefore

$$\sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i} = (2 + i) + (2 - i) = 4,$$


assuming that i obeys the same rules as ordinary, real, numbers. His calcu- 
lation was, in effect, an experimental test of the proposition that complex 
numbers form a field—a proposition that could not have been formulated, 
let alone proved, at the time. The first rigorous treatment of complex num- 
bers was that of Hamilton, who in 1835 gave definitions of complex num- 
bers, addition, and multiplication that make a proof of the field properties 
crystal clear. 
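Bombelli's arithmetic is easy to replay today. The sketch below uses Python's built-in complex numbers, with `1j` playing the role of $i$; the variable names are ours, and the check is only an illustration of the calculation described above.

```python
# (2 + i) and (2 - i), the cube roots proposed by Bombelli
z_plus, z_minus = complex(2, 1), complex(2, -1)

cube_plus = z_plus * z_plus * z_plus       # (2 + i)^3
cube_minus = z_minus * z_minus * z_minus   # (2 - i)^3

x = z_plus + z_minus                       # value of the Cardano expression
p, q = 15, 4                               # coefficients of x^3 = px + q

print(cube_plus, cube_minus)               # the two radicands 2 + 11i and 2 - 11i
print(x, 4**3 == p * 4 + q)                # x is 4, and 4 solves the cubic
```

The radicand under the inner square root is $(q/2)^2 - (p/3)^3 = 4 - 125 = -121$, which is why $11i$ appears.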

Hamilton defined complex numbers as ordered pairs $z = (a, b)$ of real
numbers, and he defined their sum and product by

$$(a_1, b_1) + (a_2, b_2) = (a_1 + a_2,\ b_1 + b_2),$$
$$(a_1, b_1)(a_2, b_2) = (a_1a_2 - b_1b_2,\ a_1b_2 + b_1a_2).$$


Of course, these definitions are motivated by the interpretation of $(a, b)$ as
$a + ib$, where $i^2 = -1$, but the important point is that the field properties
follow from these definitions and the properties of real numbers. The properties
of addition are directly “inherited” from properties of real number
addition. For example, for complex numbers $z_1 = (a_1, b_1)$ and $z_2 = (a_2, b_2)$
we have

$$z_1 + z_2 = z_2 + z_1$$

because

$$a_1 + a_2 = a_2 + a_1 \text{ and } b_1 + b_2 = b_2 + b_1 \text{ for real numbers } a_1, a_2, b_1, b_2.$$


Indeed, the properties of addition are not special properties of pairs, they 
also hold for the vector sum of triples, quadruples, and so on. The field 
properties of multiplication, on the other hand, depend on the curious defi- 
nition of product of pairs, which has no obvious generalization to a product 
of n-tuples for n > 2. 
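Hamilton's definitions translate directly into code, which makes it easy to spot-check field properties on examples. This is only a sketch: the function names `add`, `mul`, and `inv` are ours, exact rational entries stand in for real numbers, and checking a few sample pairs is of course not a proof.

```python
from fractions import Fraction as F

def add(z, w):
    """Hamilton's sum of pairs."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """Hamilton's product of pairs: (a1 a2 - b1 b2, a1 b2 + b1 a2)."""
    (a1, b1), (a2, b2) = z, w
    return (a1*a2 - b1*b2, a1*b2 + b1*a2)

def inv(z):
    """Multiplicative inverse of a nonzero pair."""
    a, b = z
    n = a*a + b*b            # nonzero exactly when z != (0, 0)
    return (a / n, -b / n)

# Sample pairs (arbitrary choices for illustration)
z1, z2, z3 = (F(1), F(2)), (F(3), F(-1)), (F(-2), F(5))
one, i = (F(1), F(0)), (F(0), F(1))

print(mul(z1, z2) == mul(z2, z1))                            # commutative law
print(mul(z1, mul(z2, z3)) == mul(mul(z1, z2), z3))          # associative law
print(mul(z1, add(z2, z3)) == add(mul(z1, z2), mul(z1, z3))) # distributive law
print(mul(z1, inv(z1)) == one)                               # inverses
print(mul(i, i))                                             # the pair (-1, 0)
```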

This raises the question: is it possible to define a “product” operation on
$\mathbb{R}^n$ that, together with the vector sum operation, makes $\mathbb{R}^n$ a field? Hamilton
hoped to find such a product for each n. Indeed, he hoped to find a
product with not only the field properties but also the multiplicative absolute
value

$$|uv| = |u||v|,$$

where the absolute value of $u = (x_1, x_2, \ldots, x_n)$ is $|u| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.
As we have seen, for n = 2 this property is equivalent to the Diophantus
identity for sums of two squares, so a multiplicative absolute value in general
implies an identity for sums of n squares.

Hamilton attacked the problem from the opposite direction, as it were. 
He tried to define the product operation, first for triples, before worrying 
about the absolute value. But after searching fruitlessly for 13 years, he 
had to admit defeat. He still had not noticed that there is no three square 
identity, but he suspected that multiplying triples of the form a+ bi+ cj 
requires a new object k = ij. Also, he began to realize that there is no hope 
for the commutative law of multiplication. Desperate to salvage something 
from his 13 years of work, he made the leap to the fourth dimension. He 
took k = ij to be a vector perpendicular to 1, i, and j, and sacrificed the 
commutative law by allowing ij = —ji, jk = —kj, and ki = —ik. On Octo- 
ber 16, 1843 he had his famous epiphany that i, j, and k must satisfy 


$$i^2 = j^2 = k^2 = ijk = -1.$$


As we have seen in Section 1.3, these relations imply all the field prop- 
erties, except commutative multiplication. Such a system is often called 
a skew field (though this term unfortunately suggests a specialization of 
the field concept, rather than what it really is—a generalization). Hamil- 
ton’s relations also imply that absolute value is multiplicative—a fact he 
had to check, though the equivalent four-square identity was well known 
to number theorists. 
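Both facts in this paragraph can be spot-checked with integer quaternions. In the sketch below quaternions are 4-tuples $(a, b, c, d) = a + bi + cj + dk$; the names `qmul` and `norm2` and the sample quaternions are our own choices, and the product formula is the standard one that follows from Hamilton's relations.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def norm2(q):
    """Squared absolute value: a sum of four squares."""
    return sum(x * x for x in q)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)

# Hamilton's relations i^2 = j^2 = k^2 = ijk = -1
print(qmul(i, i) == qmul(j, j) == qmul(k, k) == qmul(qmul(i, j), k) == minus_one)

# Noncommutativity: ij = k but ji = -k
print(qmul(i, j), qmul(j, i))

# Multiplicative absolute value on a sample pair (a four-square identity instance)
p, q = (1, 2, 3, 4), (5, 6, 7, 8)
print(norm2(qmul(p, q)) == norm2(p) * norm2(q))
```

For the sample pair, $30 \cdot 174 = 5220 = 60^2 + 12^2 + 30^2 + 24^2$, one instance of the four-square identity.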

In 1878, Frobenius proved that the quaternion algebra $\mathbb{H}$ is the only
skew field $\mathbb{R}^n$ that is not a field, so Hamilton had found the only “algebra
of n-tuples” it was possible to find under the conditions he had imposed.

The multiplicative absolute value, as stressed in Section 1.4, implies
that multiplication by a quaternion of absolute value 1 is an isometry of
$\mathbb{R}^4$. Hamilton seems to have overlooked this important geometric fact, and
the quaternion representation of space rotations (Section 1.5) was first published
by Cayley in 1845. Cayley also noticed that the corresponding formulas
for transforming the coordinates of $\mathbb{R}^3$ had been given by Rodrigues
in 1840. Cayley’s discovery showed that the noncommutative quaternion
product is a good thing, because space rotations are certainly noncommutative;
hence they can be faithfully represented only by a noncommutative
algebra. This finding has been enthusiastically endorsed by the computer
graphics profession today, which uses quaternions as a standard tool for
rendering 3-dimensional motion.

The quaternion algebra $\mathbb{H}$ plays two roles in Lie theory. On the one
hand, $\mathbb{H}$ gives the most understandable treatment of rotations in $\mathbb{R}^3$ and $\mathbb{R}^4$,
and hence of the rotation groups of these two spaces. The rotation groups
of $\mathbb{R}^3$ and $\mathbb{R}^4$ are Lie groups, and they illustrate many general features of
Lie theory in a way that is easy to visualize and compute. On the other
hand, $\mathbb{H}$ also provides coordinates for an infinite series of spaces $\mathbb{H}^n$, with
properties closely analogous to those of the spaces $\mathbb{R}^n$ and $\mathbb{C}^n$. In particular,
we can generalize the concept of “rotation group” from $\mathbb{R}^n$ to both $\mathbb{C}^n$ and
$\mathbb{H}^n$ (see Chapter 3). It turns out that almost all Lie groups and Lie algebras
can be associated with the spaces $\mathbb{R}^n$, $\mathbb{C}^n$, or $\mathbb{H}^n$, and these are the spaces
we are concerned with in this book.

However, we cannot fail to mention what falls outside our scope: the 
8-dimensional algebra O of octonions. Octonions were discovered by a 
friend of Hamilton, John Graves, in December 1843. Graves noticed that 
the algebra of quaternions could be derived from Euler’s four-square iden- 
tity, and he realized that an eight-square identity would similarly yield a 
“product” of octuples with multiplicative absolute value. An eight-square 
identity had in fact been published by the Danish mathematician Degen in 
1818, but Graves did not know this. Instead, Graves discovered the eight- 
square identity himself, and with it the algebra of octonions. The octonion 
sum, as usual, is the vector sum, and the octonion product is not only non- 
commutative but also nonassociative. That is, it is not generally the case 
that u(vw) = (uv)w. 

The nonassociative octonion product causes trouble both algebraically 
and geometrically. On the algebraic side, one cannot represent octonions 
by matrices, because the matrix product is associative. On the geometric 
side, an octonion projective space (of more than two dimensions) is im- 
possible, because of a theorem of Hilbert from 1899. Hilbert’s theorem 
essentially states that the coordinates of a projective space satisfy the asso- 
ciative law of multiplication (see Hilbert [1971]). One therefore has only 
$\mathbb{O}$ itself, and the octonion projective plane, $\mathbb{OP}^2$, to work with. Because of
this, there are few important Lie groups associated with the octonions. But
these are a very select few! They are called the exceptional Lie groups, and
they are among the most interesting objects in mathematics. Unfortunately,
they are beyond the scope of this book, so we can mention them only in
passing.


Groups


PREVIEW 


This chapter begins by reviewing some basic group theory—subgroups, 
quotients, homomorphisms, and isomorphisms—in order to have a basis 
for discussing Lie groups in general and simple Lie groups in particular. 

We revisit the group $S^3$ of unit quaternions, this time viewing its relation
to the group SO(3) as a 2-to-1 homomorphism. It follows that $S^3$ is
not a simple group. On the other hand, SO(3) is simple, as we show by a
direct geometric proof.

This discovery motivates much of Lie theory. There are infinitely many 
simple Lie groups, and most of them are generalizations of rotation groups 
in some sense. However, deep ideas are involved in identifying the simple 
groups and in showing that we have enumerated them all. 

To show why it is not easy to identify all the simple Lie groups we
make a special study of SO(4), the rotation group of $\mathbb{R}^4$. Like SO(3),
SO(4) can be described with the help of quaternions. But a rotation of
$\mathbb{R}^4$ generally depends on two quaternions, and this gives SO(4) a special
structure, related to the direct product of $S^3$ with itself. In particular, it
follows that SO(4) is not simple.


2.1 Crash course on groups


For readers who would like a reminder of the basic properties of groups, 
here is a crash course, oriented toward the kind of groups studied in this 
book. Even those who have not seen groups before will be familiar with the 
computational tricks—such as canceling by multiplying by the inverse— 
since they are the same as those used in matrix computations. 

First, a group G is a set with “product” and “inverse” operations, and
an identity element 1, with the following three basic properties:

$$g_1(g_2g_3) = (g_1g_2)g_3 \quad \text{for all } g_1, g_2, g_3 \in G,$$
$$g1 = 1g = g \quad \text{for all } g \in G,$$
$$gg^{-1} = g^{-1}g = 1 \quad \text{for all } g \in G.$$

It should be mentioned that 1 is the unique element $g'$ such that $gg' = g$
for all $g \in G$, because multiplying the equation $gg' = g$ on the left by $g^{-1}$
gives $g' = 1$. Similarly, for each $g \in G$, $g^{-1}$ is the unique element $g''$ such
that $gg'' = 1$.

The above notation for “product,” “inverse,” and “identity” is called
multiplicative notation. It is used (sometimes with I, e, or a boldface 1 in place of 1)
for groups of numbers, quaternions, matrices, and all other groups whose
operation is called “product.” There are a few groups whose operation is
called “sum,” such as $\mathbb{R}^n$ under vector addition. For these we use additive
notation: $g_1 + g_2$ for the “sum” of $g_1, g_2 \in G$, $-g$ for the inverse of $g \in G$,
and 0 (or a boldface 0) for the identity of G. Additive notation is used only when G is
abelian, that is, when $g_1 + g_2 = g_2 + g_1$ for all $g_1, g_2 \in G$.

Since groups are generally not abelian, we have to speak of multiplying
h by g “on the left” or “on the right,” because gh and hg are generally
different. If we multiply all members $g'$ of a group G on the left by a
particular $g \in G$, we get back all the members of G, because for any $g'' \in G$
there is a $g' \in G$ such that $gg' = g''$ (namely $g' = g^{-1}g''$).


Subgroups and cosets 


To study a group G we look at the groups H contained in it, the subgroups
of G. For each subgroup H of G we have a decomposition of G into disjoint
pieces called the (left or right) cosets of H in G. The left cosets (which we
stick with, for the sake of consistency) are the sets of the form

$$gH = \{gh : h \in H\}.$$

Thus H itself is the coset for g = 1, and in general a coset gH is “H translated
by g,” though one cannot usually take the word “translation” literally.
One example for which this is literally true is G the plane $\mathbb{R}^2$ of points
(x, y) under vector addition, and H the subgroup of points (0, y). In this
case we use additive notation and write the coset of (x, y) as

$$(x, y) + H = \{(x, y) : y \in \mathbb{R}\}, \quad \text{where } x \text{ is constant}.$$


Then H is the y-axis and the coset (x, y) + H is H translated by the vector
(x, y) (see Figure 2.1). This example also illustrates how a group G decomposes
into disjoint cosets (decomposing the plane into parallel lines), and
that different $g \in G$ can give the same coset gH. For example, (1, 0) + H
and (1, 1) + H are both the vertical line x = 1.

Figure 2.1: Subgroup H of $\mathbb{R}^2$ and cosets.
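The same decomposition can be seen in a small nonabelian group. The sketch below is an illustration of our own choosing, not an example from the text: it takes G to be the six permutations of {0, 1, 2}, stored as tuples, and H the two-element subgroup generated by the swap of 0 and 1. The names `compose` and `left_coset` are hypothetical helpers.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[x]] for x in range(len(q)))

G = list(permutations(range(3)))     # the six permutations of three objects
H = [(0, 1, 2), (1, 0, 2)]           # subgroup: identity and the swap of 0 and 1

def left_coset(g, H):
    """The left coset gH = {gh : h in H}, as a set."""
    return frozenset(compose(g, h) for h in H)

cosets = {left_coset(g, H) for g in G}   # different g may give the same coset

for c in sorted(cosets, key=sorted):
    print(sorted(c))
```

Six choices of g yield only three distinct cosets, each with as many members as H, and together they cover G without overlap.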

Each coset gH is in 1-to-1 correspondence with H because we get back
each $h \in H$ from $gh \in gH$ by multiplying on the left by $g^{-1}$. Different
cosets are disjoint because if $g \in g_1H$ and $g \in g_2H$ then

$$g = g_1h_1 = g_2h_2 \quad \text{for some } h_1, h_2 \in H,$$

and therefore $g_1 = g_2h_2h_1^{-1}$. But then

$$g_1H = g_2h_2h_1^{-1}H = g_2(h_2h_1^{-1}H) = g_2H$$

because $h_2h_1^{-1} \in H$ and therefore $h_2h_1^{-1}H = H$ by the remark at the end of
the last subsection (that multiplying a group by one of its members gives
back the group). Thus if two cosets have an element in common, they are
identical.

This algebraic argument has surprising geometric consequences; for
example, a filling of $S^3$ by disjoint circles known as the Hopf fibration.
Figure 2.2 shows some of the circles, projected stereographically into $\mathbb{R}^3$.
The circles fill nested torus surfaces, one of which is shown in gray.


Figure 2.2: Some circles in the Hopf fibration. 


Proposition: $S^3$ can be decomposed into disjoint congruent circles.


Proof. As we saw in Section 1.3, the quaternions $a + bi + cj + dk$ of unit
length satisfy

$$a^2 + b^2 + c^2 + d^2 = 1,$$

and hence they form a 3-sphere $S^3$. The unit quaternions also form a group
G, because the product and inverse of unit quaternions are also unit quaternions,
by the multiplicative property of absolute value.

One subgroup H of G consists of the unit quaternions of the form
$\cos\theta + i\sin\theta$, and these form a unit circle in the plane spanned by 1 and
i. It follows that any coset gH is also a unit circle, because multiplication
by a quaternion g of unit length is an isometry, as we saw in Section
1.4. Since the cosets gH fill the whole group and are disjoint, we have a
decomposition of the 3-sphere into unit circles. □


Exercises


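An important nonabelian group (in fact, it is the simplest example of a nonabelian
Lie group) is the group of functions of the form

$$f_{a,b}(x) = ax + b, \quad \text{where } a, b \in \mathbb{R} \text{ and } a > 0.$$

The group operation is function composition.


2.1.1 If $f_{a,b}(x) = f_{a_1,b_1}(f_{a_2,b_2}(x))$, work out $a, b$ in terms of $a_1, b_1, a_2, b_2$, and
check that they are the same as the $a, b$ determined by

$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a_1 & b_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ 0 & 1 \end{pmatrix}.$$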


2.1.2 Also show that the inverse function $f_{a,b}^{-1}(x)$ exists, and that it corresponds to
the inverse matrix

$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^{-1}.$$


This correspondence between functions and matrices is a matrix representation of
the group of functions $f_{a,b}$. We have already seen examples of matrix representations
of groups (such as the rotation groups in two and three dimensions) and,
in fact, most of the important Lie groups can be represented by matrices.

The unit complex numbers, $\cos\theta + i\sin\theta$, form a group SO(2) that we began
to study in Section 1.1. We now investigate its subgroups.


2.1.3 Other than the trivial group {1}, what is the smallest subgroup of SO(2)? 


2.1.4 Show that there is exactly one n-element subgroup of SO(2), for each natu- 
ral number n, and list its members. 


2.1.5 Show that the union R of all the finite subgroups of SO(2) is also a subgroup 
(the group of “rational rotations”). 


2.1.6 If z is a unit complex number not in the group R described in Exercise 2.1.5,
show that the numbers $\ldots, z^{-2}, z^{-1}, 1, z, z^2, \ldots$ are all distinct, and that they
form a subgroup of SO(2).

2.2 Crash course on homomorphisms 

Normal subgroups 


Since $hg \neq gh$ in general, it can also be that $gH \neq Hg$, where

$$Hg = \{hg : h \in H\}$$

is the right coset of H. If $gH = Hg$ for all $g \in G$, we say that H is a normal
subgroup of G. An equivalent statement is that H equals

$$g^{-1}Hg = \{g^{-1}hg : h \in H\} \quad \text{for each } g \in G.$$

(Because of this, it would be more sensible to call H “self-conjugate,” but
unfortunately the overused word “normal” has stuck.)

The good thing about a normal subgroup H is that its cosets themselves
form a group when “multiplied” by the rule that “the coset of $g_1$, times the
coset of $g_2$, equals the coset of $g_1g_2$”:

$$g_1H \cdot g_2H = g_1g_2H.$$

This rule makes sense for a normal subgroup H because if $g_1'H = g_1H$ and
$g_2'H = g_2H$ then $g_1'g_2'H = g_1g_2H$ as follows:

$$\begin{aligned}
g_1'g_2'H &= g_1'Hg_2' && \text{since } g_2'H = Hg_2' \text{ by normality,} \\
&= g_1Hg_2' && \text{since } g_1'H = g_1H \text{ by assumption,} \\
&= g_1g_2'H && \text{since } g_2'H = Hg_2' \text{ by normality,} \\
&= g_1g_2H && \text{since } g_2'H = g_2H \text{ by assumption.}
\end{aligned}$$


The group of cosets is called the quotient group of G by H, and is 
written G/H. (When G and H are finite, the size of G/H is indeed the size 
of G divided by the size of H.) We reiterate that the quotient group G/H 
exists only when H is a normal subgroup. Another, more efficient, way to 
describe this situation is in terms of homomorphisms: structure-preserving 
maps from one group to another. 
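To see concretely why normality is needed, one can test on a small group whether the coset of a product depends on the choice of representatives. The sketch below is our own illustration, using the permutations of {0, 1, 2} stored as tuples; the helper names and the two sample subgroups (the three cyclic shifts, which form a normal subgroup, and a two-element subgroup that is not normal) are assumptions of the example.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[x]] for x in range(len(q)))

G = list(permutations(range(3)))

def left_coset(g, H):
    return frozenset(compose(g, h) for h in H)

def product_is_well_defined(H):
    """Check that the coset of ab depends only on the cosets of a and b."""
    for g1 in G:
        for g2 in G:
            target = left_coset(compose(g1, g2), H)
            for a in left_coset(g1, H):          # any representative of g1 H
                for b in left_coset(g2, H):      # any representative of g2 H
                    if left_coset(compose(a, b), H) != target:
                        return False
    return True

A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # cyclic shifts: a normal subgroup
K = [(0, 1, 2), (1, 0, 2)]               # identity and one swap: not normal

print(product_is_well_defined(A3))   # the rule g1H . g2H = g1g2H makes sense
print(product_is_well_defined(K))    # the rule breaks down
```

For the normal subgroup the rule is consistent; for the non-normal one, different representatives of the same cosets can land in different product cosets.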


Homomorphisms and isomorphisms 


When H is a normal subgroup of G, the map $\varphi : G \to G/H$ defined by

$$\varphi(g) = gH \quad \text{for all } g \in G$$

preserves products in the sense that

$$\varphi(g_1g_2) = \varphi(g_1) \cdot \varphi(g_2).$$

This follows immediately from the definition of product of cosets, because

$$\varphi(g_1g_2) = g_1g_2H = g_1H \cdot g_2H = \varphi(g_1) \cdot \varphi(g_2).$$

In general, a map $\varphi : G \to G'$ of one group into another is called a
homomorphism (from the Greek for “similar form”) if it preserves products.
A group homomorphism indeed preserves group structure, because it not
only preserves products, but also the identity and inverses. Here is why:


• Since $g = 1g$ for any $g \in G$, we have

$$\varphi(g) = \varphi(1g) = \varphi(1)\varphi(g) \quad \text{because } \varphi \text{ preserves products.}$$

Multiplying both sides on the right by $\varphi(g)^{-1}$ then gives $1 = \varphi(1)$.


• Since $1 = gg^{-1}$ for any $g \in G$, we have

$$1 = \varphi(1) = \varphi(gg^{-1}) = \varphi(g)\varphi(g^{-1})$$

because $\varphi$ preserves products.

This says that $\varphi(g^{-1}) = \varphi(g)^{-1}$, because the inverse of $\varphi(g)$ is
unique.


Thus the image $\varphi(G)$ is of “similar” form to G, but we say that $G'$ is
isomorphic (of the “same form”) to G only when the map $\varphi$ is 1-to-1 and
onto (in which case we call $\varphi$ an isomorphism). In general, $\varphi(G)$ is only a
shadow of G, because many elements of G may map to the same element
of $G'$. The case furthest from isomorphism is that in which $\varphi$ sends all
elements of G to 1.

Any homomorphism $\varphi$ of G onto $G'$ can be viewed as the special type
$\varphi : G \to G/H$. The appropriate normal subgroup H of G is the so-called
kernel of $\varphi$:

$$H = \ker\varphi = \{g \in G : \varphi(g) = 1\}.$$


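Then $G'$ is isomorphic to the group $G/\ker\varphi$ of cosets of $\ker\varphi$ because:

1. $\ker\varphi$ is a group, because

$$\begin{aligned}
h_1, h_2 \in \ker\varphi &\Rightarrow \varphi(h_1) = \varphi(h_2) = 1 \\
&\Rightarrow \varphi(h_1)\varphi(h_2) = 1 \\
&\Rightarrow \varphi(h_1h_2) = 1 \\
&\Rightarrow h_1h_2 \in \ker\varphi
\end{aligned}$$

and

$$\begin{aligned}
h \in \ker\varphi &\Rightarrow \varphi(h) = 1 \\
&\Rightarrow \varphi(h)^{-1} = 1 \\
&\Rightarrow \varphi(h^{-1}) = 1 \\
&\Rightarrow h^{-1} \in \ker\varphi.
\end{aligned}$$

2. $\ker\varphi$ is a normal subgroup of G, because, for any $g \in G$,

$$\begin{aligned}
h \in \ker\varphi &\Rightarrow \varphi(ghg^{-1}) = \varphi(g)\varphi(h)\varphi(g^{-1}) = \varphi(g)1\varphi(g)^{-1} = 1 \\
&\Rightarrow ghg^{-1} \in \ker\varphi.
\end{aligned}$$

Hence $g(\ker\varphi)g^{-1} = \ker\varphi$, that is, $\ker\varphi$ is normal.

3. Each $g' = \varphi(g) \in G'$ corresponds to the coset $g(\ker\varphi)$.
In fact, $g(\ker\varphi) = \varphi^{-1}(g')$, because

$$\begin{aligned}
k \in \varphi^{-1}(g') &\Leftrightarrow \varphi(k) = g' && \text{(definition of } \varphi^{-1}\text{)} \\
&\Leftrightarrow \varphi(k) = \varphi(g) \\
&\Leftrightarrow \varphi(g)^{-1}\varphi(k) = 1 \\
&\Leftrightarrow \varphi(g^{-1}k) = 1 \\
&\Leftrightarrow g^{-1}k \in \ker\varphi \\
&\Leftrightarrow k \in g(\ker\varphi).
\end{aligned}$$

4. Products of elements $g_1', g_2' \in G'$ correspond to products of the
corresponding cosets:

$$g_1' = \varphi(g_1),\ g_2' = \varphi(g_2) \Rightarrow \varphi^{-1}(g_1') = g_1(\ker\varphi),\ \varphi^{-1}(g_2') = g_2(\ker\varphi)$$

by step 3. But also

$$\begin{aligned}
g_1' = \varphi(g_1),\ g_2' = \varphi(g_2) &\Rightarrow g_1'g_2' = \varphi(g_1)\varphi(g_2) = \varphi(g_1g_2) \\
&\Rightarrow \varphi^{-1}(g_1'g_2') = g_1g_2(\ker\varphi),
\end{aligned}$$

also by step 3. Thus the product $g_1'g_2'$ corresponds to $g_1g_2(\ker\varphi)$,
which is the product of the cosets corresponding to $g_1'$ and $g_2'$, respectively.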
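The four steps can be traced on a small example. The sketch below is our own illustration: G is the group of permutations of {0, 1, 2} stored as tuples, $G' = \{1, -1\}$ under multiplication, and $\varphi$ is the sign of a permutation, computed here by counting inversions. The helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product pq of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[x]] for x in range(len(q)))

def sign(p):
    """Sign homomorphism: +1 for an even number of inversions, -1 for odd."""
    inversions = sum(1 for a in range(len(p))
                     for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))
kernel = frozenset(g for g in G if sign(g) == 1)        # elements sent to 1

def coset(g):
    """The coset g(ker sign)."""
    return frozenset(compose(g, k) for k in kernel)

# Fibers of the map: all elements of G sent to a given element of G'
fibers = {}
for g in G:
    fibers.setdefault(sign(g), set()).add(g)

# sign preserves products
print(all(sign(compose(g1, g2)) == sign(g1) * sign(g2) for g1 in G for g2 in G))
# step 3: the coset of g is exactly the set of elements with the same image as g
print(all(coset(g) == fibers[sign(g)] for g in G))
# the cosets correspond 1-to-1 with the elements of G'
print(len({coset(g) for g in G}) == len(fibers))
```

Here the kernel consists of the three even permutations, and its two cosets correspond to the two elements of $G'$.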


To sum up: a group homomorphism $\varphi$ of G onto $G'$ gives a 1-to-1 correspondence
between $G'$ and $G/(\ker\varphi)$ that preserves products, that is, $G'$
is isomorphic to $G/(\ker\varphi)$.

This result is called the fundamental homomorphism theorem for
groups.


The det homomorphism


An important homomorphism for real and complex matrix groups G is the
determinant map

$$\det : G \to \mathbb{C}^\times,$$

where $\mathbb{C}^\times$ denotes the multiplicative group of nonzero complex numbers.
The determinant map is a homomorphism because det is multiplicative,
that is, det(AB) = det(A) det(B), a fact well known from linear algebra.

The kernel of det, consisting of the matrices with determinant 1, is 
therefore a normal subgroup of G. Many important Lie groups arise in 
precisely this way, as we will see in Chapter 3. 
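A quick spot check of the multiplicative property, and of the fact that determinant-1 matrices are closed under product, can be done by hand or in a few lines. The sketch below uses 2 × 2 integer matrices stored as nested tuples; the helper names `matmul` and `det` and the sample matrices are ours.

```python
def matmul(A, B):
    """Product of 2 x 2 matrices stored as ((a, b), (c, d))."""
    return tuple(tuple(sum(A[r][m] * B[m][c] for m in range(2)) for c in range(2))
                 for r in range(2))

def det(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

A = ((2, 1), (1, 1))     # det 1: in the kernel of det
B = ((1, 3), (0, 1))     # det 1: in the kernel of det
C = ((2, 0), (0, 3))     # det 6: not in the kernel

print(det(matmul(A, C)) == det(A) * det(C))   # det is multiplicative
print(det(matmul(A, B)))                      # product of kernel elements stays in the kernel
```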


Simple groups 


A many-to-1 homomorphism of a group G maps it onto a group G’ that 
is “simpler” than G (or, at any rate, not more complicated than G). For 
this reason, groups that admit no such homomorphism, other than the ho- 
momorphism sending all elements to 1, are called simple. Equivalently, a 
nontrivial group is simple if it contains no normal subgroups other than 
itself and the trivial group. 

One of the main goals of group theory in general, and Lie group theory 
in particular, is to find all the simple groups. We find the first interesting 
example in the next section. 


Exercises 


2.2.1 Check that $z \mapsto z^2$ is a homomorphism of $S^1$. What is its kernel? What are
the cosets of the kernel?


2.2.2 Show directly (that is, without appealing to Exercise 2.2.1) that pairs $\{\pm z_\alpha\}$,
where $z_\alpha = \cos\alpha + i\sin\alpha$, form a group G when pairs are multiplied by the
rule

$$\{\pm z_\alpha\} \cdot \{\pm z_\beta\} = \{\pm(z_\alpha z_\beta)\}.$$

Show also that the function $\varphi : S^1 \to G$ that sends both $z_\alpha, -z_\alpha \in S^1$ to the
pair $\{\pm z_\alpha\}$ is a 2-to-1 homomorphism.


2.2.3 Show that $\{\pm z\} \mapsto z^2$ is a well-defined map from G onto $S^1$, where G is the
group described in Exercise 2.2.2, and that this map is an isomorphism.


The space that consists of the pairs $\{\pm z\}$ of opposite (or “antipodal”) points
on the circle is called the real projective line $\mathbb{RP}^1$. Thus the above exercises
show that the real projective line has a natural group structure, under which it is
isomorphic to the circle group $S^1$.

In the next section we will consider the real projective space $\mathbb{RP}^3$, consisting
of the antipodal point pairs $\{\pm q\}$ on the 3-sphere $S^3$. These pairs likewise have
a natural product operation, which makes $\mathbb{RP}^3$ a group; in fact, it is the group
SO(3) of rotations of $\mathbb{R}^3$. We will show that $\mathbb{RP}^3$ is not the same group as $S^3$,
because SO(3) is simple and $S^3$ is not.

We can see right now that $S^3$ is not simple, by finding a nontrivial normal
subgroup.


2.2.4 Show that $\{\pm 1\}$ is a normal subgroup of $S^3$.


However, it turns out that $\{\pm 1\}$ is the only nontrivial normal subgroup of $S^3$.
In particular, the subgroup $S^1$ that we found in Section 2.1 is not normal.


2.2.5 Show that $S^1$ is not a normal subgroup of $S^3$.


2.3 The groups SU(2) and SO(3) 


The group SO(2) of rotations of $\mathbb{R}^2$ about O can be viewed as a geometric
object, namely the unit circle in the plane, as we observed in Section 1.1.

The unit circle, $S^1$, is the first in the series of unit n-spheres $S^n$, the nth
of which consists of the points at distance 1 from the origin in $\mathbb{R}^{n+1}$. Thus
$S^2$ is the ordinary sphere, consisting of the points at distance 1 from the
origin in $\mathbb{R}^3$. Unfortunately (for those who would like an example of an
easily visualized but nontrivial Lie group) there is no rule for multiplying
points that makes $S^2$ a Lie group. In fact, the only other Lie group among
the n-spheres is $S^3$. As we saw in Section 1.3, it becomes a group when
its points are viewed as unit quaternions, under the operation of quaternion
multiplication.

The group $S^3$ of unit quaternions can also be viewed as the group of
2 × 2 complex matrices of the form

$$Q = \begin{pmatrix} a + di & -b - ci \\ b - ci & a - di \end{pmatrix}, \quad \text{where } \det(Q) = 1,$$

because these are precisely the quaternions of absolute value 1. Such matrices
are called unitary, and the group $S^3$ is also known as the special unitary
group SU(2). Unitary matrices are the complex counterpart of orthogonal
matrices, and we study the analogy between the two in Chapters 3 and 4.
The group SU(2) is closely related to the group SO(3) of rotations
of $\mathbb{R}^3$. As we saw in Section 1.5, rotations of $\mathbb{R}^3$ correspond 1-to-1 to
the pairs $\pm t$ of antipodal unit quaternions, the rotation being induced on
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ by the conjugation map $q \mapsto t^{-1}qt$. Also, the group operation
of SO(3) corresponds to quaternion multiplication, because if one rotation
is induced by conjugation by $t_1$, and another by conjugation by $t_2$, then
conjugation by $t_1t_2$ induces the product rotation (first rotation followed by
the second). Of course, we multiply pairs $\pm t$ of quaternions by the rule

$$(\pm t_1)(\pm t_2) = \pm t_1t_2.$$


We therefore identify SO(3) with the group $\mathbb{RP}^3$ of unit quaternion
pairs $\pm t$ under this product operation. The map $\varphi : SU(2) \to SO(3)$ defined
by $\varphi(t) = \{\pm t\}$ is a 2-to-1 homomorphism, because the two elements $t$ and
$-t$ of SU(2) go to the single pair $\pm t$ in SO(3). Thus SO(3) looks “simpler”
than SU(2) because SO(3) has only one element where SU(2) has two.
Indeed, SO(3) is “simpler” because SU(2) is not simple (it has the normal
subgroup $\{\pm 1\}$) and SO(3) is. We now prove this famous property of
SO(3) by showing that SO(3) has no nontrivial normal subgroup.
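The 2-to-1 behavior and the homomorphism property can both be checked exactly on unit quaternions with rational entries. The sketch below is an illustration under our own conventions: quaternions are 4-tuples $(a, b, c, d) = a + bi + cj + dk$, the helper names are hypothetical, and the sample quaternions are arbitrary choices of length 1.

```python
from fractions import Fraction as F

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def conjugate_by(t, q):
    """Return t^{-1} q t for a unit quaternion t, whose inverse is its conjugate."""
    a, b, c, d = t
    return qmul(qmul((a, -b, -c, -d), q), t)

def neg(t):
    return tuple(-x for x in t)

t1 = (F(1, 2), F(1, 2), F(1, 2), F(1, 2))   # unit quaternion with rational entries
t2 = (F(3, 5), F(0), F(4, 5), F(0))         # another: (3/5)^2 + (4/5)^2 = 1
q = (F(0), F(1), F(2), F(3))                # the vector i + 2j + 3k

# t and -t induce the same rotation
print(conjugate_by(t1, q) == conjugate_by(neg(t1), q))
# conjugation by t1, then by t2, equals conjugation by the product t1 t2
print(conjugate_by(t2, conjugate_by(t1, q)) == conjugate_by(qmul(t1, t2), q))
```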


Simplicity of SO(3). The only nontrivial subgroup of SO(3) closed under 
conjugation is SO(3) itself. 


Proof. Suppose that H is a nontrivial subgroup of SO(3), so H includes a
nontrivial rotation, say the rotation $h$ about axis $l$ through angle $\alpha$.

Now suppose that H is normal, so H also includes all elements $g^{-1}hg$
for $g \in SO(3)$. If $g$ moves axis $l$ to axis $m$, then $g^{-1}hg$ is the rotation about
axis $m$ through angle $\alpha$. (In detail, $g^{-1}$ moves $m$ to $l$, $h$ rotates through
angle $\alpha$ about $l$, then $g$ moves $l$ back to $m$.) Thus the normal subgroup H
includes the rotations through angle $\alpha$ about all possible axes.

Now a rotation through $\alpha$ about P, followed by rotation through $\alpha$ 
about Q, equals rotation through angle $\theta$ about R, where R and $\theta$ are as 
shown in Figure 2.3. As in Exercise 1.5.6, we obtain the rotation about 
P by successive reflections in the great circles PR and PQ, and then the 
rotation about Q by successive reflections in the great circles PQ and QR. 
In this sequence of four reflections, the reflections in PQ cancel out, leaving 
the reflections in PR and QR that define the rotation about R. 

As P varies continuously over some interval of the great circle through 
P and Q, $\theta$ varies continuously over some interval. (R may also vary, but 
this does not matter.) It follows that $\theta$ takes some value of the form 

\[
\frac{m\pi}{n}, \quad \text{where } m \text{ is odd},
\]

because such numbers are dense in $\mathbb{R}$. The n-fold product of this rotation 
also belongs to H, and it is a rotation about R through $m\pi$, where m is odd. 
The latter rotation is simply rotation through $\pi$, so H includes rotations 
through $\pi$ about any point on the sphere (by conjugation with a suitable g 
again). 

Figure 2.3: Angle of the product rotation. 

Finally, taking the product of rotations with $\alpha/2 = \pi/2$ in Figure 2.3, 
it is clear that we can get a rotation about R through any angle $\theta$ between 
0 and $2\pi$. Hence H includes all the rotations in SO(3). 


Exercises 


Like SO(2), SO(3) contains some finite subgroups. It contains all the finite 
subgroups of SO(2) in an obvious way (as rotations of $\mathbb{R}^3$ about a fixed axis), but 
also three more interesting subgroups called the polyhedral groups. Each 
polyhedral group is so called because it consists of the rotations that map a regular 
polyhedron into itself. 

Here we consider the group of 12 rotations that map a regular tetrahedron 
into itself. We consider the tetrahedron whose vertices are alternate vertices of the 
unit cube in $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, where the cube has center at O and edges parallel to 
the i, j, and k axes (Figure 2.4). 

First, let us see why there are indeed 12 rotations that map the tetrahedron 
into itself. To do this, observe that the position of the tetrahedron is completely 
determined when we know 


• Which of the four faces is in the position of the front face in Figure 2.4. 

• Which of the three edges of that face is at the bottom of the front face in 
Figure 2.4. 


Figure 2.4: The tetrahedron and the cube. 


2.3.1 Explain why this observation implies 12 possible positions of the tetrahe- 
dron, and also explain why all these positions can be obtained by rotations. 


2.3.2 Similarly, explain why there are 24 rotations that map the cube into itself 
(so the rotation group of the tetrahedron is different from the rotation group 
of the cube). 


The 12 rotations of the tetrahedron are in fact easy to enumerate with the help 
of Figure 2.5. As is clear from the figure, the tetrahedron is mapped into itself by 
two types of rotation: 


• A 1/2 turn about each line through the centers of opposite edges. 


• A 1/3 turn about each line through a vertex and the opposite face center. 


2.3.3 Show that there are 11 distinct rotations among these two types. What 
rotation accounts for the 12th position of the tetrahedron? 


Now we make use of the quaternion representation of rotations from Section 
1.5. Remember that a rotation about axis u through angle $\theta$ corresponds to the 
quaternion pair $\pm q$, where 

\[
q = \cos\frac{\theta}{2} + u\sin\frac{\theta}{2}.
\]

2.3.4 Show that the identity, and the three 1/2 turns, correspond to the four 
quaternion pairs $\pm 1$, $\pm\mathbf{i}$, $\pm\mathbf{j}$, $\pm\mathbf{k}$. 


Figure 2.5: The tetrahedron and axes of rotation (labels: 1/2 turn, 1/3 turn). 


2.3.5 Show that the 1/3 turns correspond to the eight antipodal pairs among the 
16 quaternions 

\[
\pm\frac{1}{2} \pm \frac{\mathbf{i}}{2} \pm \frac{\mathbf{j}}{2} \pm \frac{\mathbf{k}}{2}.
\]

The 24 quaternions obtained in Exercises 2.3.4 and 2.3.5 form an exceptionally 
symmetric configuration in $\mathbb{R}^4$. They are the vertices of a regular figure called the 
24-cell, copies of which form a “tiling” of $\mathbb{R}^4$. 
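
As a sanity check on these 24 quaternions (not a solution to the exercises), the sketch below builds them with exact rational arithmetic and tests two things: that the set is closed under quaternion multiplication, and that it splits into 12 antipodal pairs, one for each rotation of the tetrahedron. Closure under multiplication is a standard fact that goes slightly beyond what the text states; the names `qmul`, `units`, `halves`, and `cell24` are illustrative.

```python
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

half = Fraction(1, 2)

# The eight quaternions +-1, +-i, +-j, +-k.
units = []
for k in range(4):
    for s in (1, -1):
        e = [Fraction(0)] * 4
        e[k] = Fraction(s)
        units.append(tuple(e))

# The sixteen quaternions (+-1 +- i +- j +- k)/2.
halves = [tuple(s * half for s in signs) for signs in product((1, -1), repeat=4)]

cell24 = set(units) | set(halves)

# Closed under multiplication: every product lands back in the set.
closed = all(qmul(p, q) in cell24 for p in cell24 for q in cell24)

# Antipodal pairs {q, -q}; each pair represents one rotation.
pairs = {frozenset((q, tuple(-x for x in q))) for q in cell24}
```

Exact fractions are used instead of floats so that set membership is a reliable test.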


2.4 Isometries of $\mathbb{R}^n$ and reflections 


In this section we take up an idea that appeared briefly in the exercises 
for Section 1.5: the representation of isometries as products of reflections. 
There we showed that certain isometries of $\mathbb{R}^2$ and $\mathbb{R}^3$ are products of 
reflections. Here we represent isometries of $\mathbb{R}^n$ as products of reflections, 
and in the next section we use this result to describe the rotations of $\mathbb{R}^4$. 

We actually prove that any isometry of $\mathbb{R}^n$ that fixes O is the product 
of reflections in hyperplanes through O, and then specialize to 
orientation-preserving isometries. A hyperplane H through O is an $(n-1)$-dimensional 
subspace of $\mathbb{R}^n$, and reflection in H is the linear map of $\mathbb{R}^n$ that fixes the 
elements in H and reverses the vectors orthogonal to H. 


Reflection representation of isometries. Any isometry of $\mathbb{R}^n$ that fixes O 
is the product of at most n reflections in hyperplanes through O. 


Proof. We argue by induction on n. For $n = 1$ the result is the obvious one 
that the only isometries of $\mathbb{R}$ fixing O are the identity and the map $x \mapsto -x$, 
which is reflection in O. 

Now suppose that the result is true for $n = k-1$ and that f is an isometry 
of $\mathbb{R}^k$ fixing O. If f is not the identity, suppose that $v \in \mathbb{R}^k$ is such 
that $f(v) = w \ne v$. Then the reflection $r_u$ in the hyperplane orthogonal to 
$u = v - w$ maps the subspace $\mathbb{R}u$ of real multiples of u onto itself. Because 
$|v| = |w|$, it also exchanges v and w, so the map $r_uf$ (“f followed by $r_u$”) 
sends v to v and is therefore the identity on the subspace $\mathbb{R}v$ of $\mathbb{R}^k$. 

The restriction of $r_uf$ to the $\mathbb{R}^{k-1}$ orthogonal to $\mathbb{R}v$ is, by induction, 
the product of $\le k-1$ reflections. It follows that $f = r_ug$, where g is the 
product of $\le k-1$ reflections. 

Therefore, f is the product of $\le k$ reflections, and the result is true for 
all n by induction. 
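
The induction in this proof can be read as an algorithm. The following is a minimal sketch, assuming the isometry f is supplied as a Python function on lists of n floats and is linear, length-preserving, and fixes O; the names `reflect`, `reflection_normals`, and `compose` are illustrative. It works through the standard basis vectors one at a time, adding a reflection whenever the current map moves the next basis vector.

```python
import math

def reflect(u, x):
    """Reflect x in the hyperplane through O orthogonal to the nonzero vector u."""
    uu = sum(a * a for a in u)
    ux = sum(a * b for a, b in zip(u, x))
    return [b - 2.0 * ux / uu * a for a, b in zip(u, x)]

def reflection_normals(f, n, tol=1e-9):
    """Return normals u_1, ..., u_m (m <= n) such that f = r_{u_1} r_{u_2} ... r_{u_m}.

    f is assumed to be a linear isometry of R^n fixing O.
    """
    normals = []

    def g(x):
        # g = r_{u_m} ... r_{u_1} f: apply f, then the reflections found so far.
        y = f(x)
        for u in normals:
            y = reflect(u, y)
        return y

    for k in range(n):
        v = [1.0 if i == k else 0.0 for i in range(n)]   # k-th standard basis vector
        w = g(v)
        u = [a - b for a, b in zip(v, w)]                # u = v - w, as in the proof
        if math.sqrt(sum(a * a for a in u)) > tol:
            normals.append(u)                            # from now on g fixes v as well
    return normals

def compose(normals, x):
    """Apply r_{u_1} r_{u_2} ... r_{u_m} to x (rightmost reflection first)."""
    y = list(x)
    for u in reversed(normals):
        y = reflect(u, y)
    return y

# Example: rotation of R^3 about the third axis through 0.7 radians.
c, s = math.cos(0.7), math.sin(0.7)
rotation = lambda x: [c * x[0] - s * x[1], s * x[0] + c * x[1], x[2]]
normals = reflection_normals(rotation, 3)
x = [0.3, -1.2, 2.0]
```

For this example the procedure finds two reflections, and `compose(normals, x)` agrees with `rotation(x)` up to rounding error.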


It follows in particular that any orientation-preserving isometry of $\mathbb{R}^3$ 
that fixes O is the product of 0 or 2 reflections (because the product of an odd number 
of reflections reverses orientation). Thus any such isometry is a rotation 
about an axis passing through O. 

This theorem is sometimes known as the Cartan–Dieudonné theorem, 
after a more general theorem proved by Cartan [1938], and generalized 
further by Dieudonné. Cartan’s theorem concerns “reflections” in spaces 
with real or complex coordinates, and Dieudonné’s extends it to spaces 
with coordinates from finite fields. 


Exercises 


Assuming that reflections are linear, the representation of isometries as products 
of reflections shows that all isometries fixing the origin are linear maps. In fact, 
there is a nice direct proof that all such isometries (including reflections) are linear, 
pointed out to me by Marc Ryser. We suppose that f is an isometry that fixes O, 
and that u and v are any points in $\mathbb{R}^n$. 


2.4.1 Prove that f preserves straight lines and midpoints of line segments. 


2.4.2 Using the fact that u+ v is the midpoint of the line joining 2u and 2v, and 
Exercise 2.4.1, show that f(u+v) = f(u) + f(v). 


2.4.3 Also prove that f(ru) = rf(u) for any real number r. 


It is also true that reflections have determinant $-1$, hence the determinant detects 
the “reversal of orientation” effected by a reflection. 

2.4.4 Show that reflection in the hyperplane orthogonal to a coordinate axis has 
determinant $-1$, and generalize this result to any reflection. 


2.5 Rotations of $\mathbb{R}^4$ and pairs of quaternions 


A linear map is called orientation-preserving if its determinant is positive, 
and orientation-reversing otherwise. Reflections are linear and 
orientation-reversing, so a product of reflections is orientation-preserving if and only 
if it contains an even number of terms. We define a rotation of $\mathbb{R}^n$ about O 
to be an orientation-preserving isometry that fixes O. 

Thus it follows from the Cartan–Dieudonné theorem that any rotation 
of $\mathbb{R}^4$ is the product of 0, 2, or 4 reflections. The exact number is not 
important here; what we really want is a way to represent reflections by 
quaternions, as a stepping-stone to the representation of rotations by quaternions. 
Not surprisingly, each reflection is specified by the quaternion orthogonal 
to the hyperplane of reflection. More surprisingly, a rotation is specified 
by just two quaternions, regardless of the number of reflections needed to 
compose it. Our proof follows Conway and Smith [2003], p. 41. 


Quaternion representation of reflections. Reflection of $\mathbb{H} = \mathbb{R}^4$ in the 
hyperplane through O orthogonal to the unit quaternion u is the map that 
sends each $q \in \mathbb{H}$ to $-u\bar{q}u$. 
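
Before the proof, the claim can be checked numerically. The sketch below assumes quaternions are stored as 4-tuples (a, b, c, d) for a + bi + cj + dk; the helper names are illustrative. It verifies that the map q ↦ −u·conj(q)·u reverses u, fixes iu, ju, and ku, and agrees with the usual vector formula q − 2(q·u)u for reflection in the hyperplane orthogonal to a unit vector u.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    """Quaternion conjugate."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def reflect(u, q):
    """The map q -> -u * conj(q) * u for a unit quaternion u."""
    return tuple(-x for x in qmul(qmul(u, qconj(q)), u))

def dot(p, q):
    """Standard inner product on R^4."""
    return sum(x * y for x, y in zip(p, q))

def close(p, q, tol=1e-12):
    """Componentwise comparison up to floating-point error."""
    return all(abs(x - y) < tol for x, y in zip(p, q))

n = math.sqrt(30.0)
u = (1.0 / n, 2.0 / n, 3.0 / n, 4.0 / n)        # an arbitrary unit quaternion
i, j, k = (0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)

reverses_u = close(reflect(u, u), tuple(-x for x in u))
fixes_hyperplane = all(close(reflect(u, qmul(e, u)), qmul(e, u)) for e in (i, j, k))

q = (0.3, -1.2, 2.0, 0.7)                        # an arbitrary test quaternion
vector_formula = tuple(x - 2.0 * dot(q, u) * y for x, y in zip(q, u))
agrees = close(reflect(u, q), vector_formula)
```

A numerical check of this kind is no substitute for the proof, but it is a quick way to catch a sign or ordering slip in the formula.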


Proof. First observe that the map $q \mapsto -u\bar{q}u$ is an isometry. This is because 


• $q \mapsto -\bar{q}$ reverses the real part of q and keeps the imaginary part fixed, 
hence it is reflection in the hyperplane spanned by i, j, and k. 


• Multiplication on the left by the unit quaternion u is an isometry 
by the argument in Section 1.4, and there is a similar argument for 
multiplication on the right. 


Next notice that the map $q \mapsto -u\bar{q}u$ sends 

\[
\begin{aligned}
vu \ \text{ to } \ -u\,\overline{(vu)}\,u &= -u\bar{u}\bar{v}u && \text{because } \overline{(vu)} = \bar{u}\bar{v}, \\
&= -\bar{v}u && \text{because } u\bar{u} = |u|^2 = 1.
\end{aligned}
\]

In particular, the map sends u to $-u$, so vectors parallel to u are reversed. 
And it sends iu to iu, because $\bar{\mathbf{i}} = -\mathbf{i}$, and similarly ju to ju and ku to ku. 
Thus the vectors iu, ju, and ku, which span the hyperplane orthogonal to 
u, are fixed. Hence the map $q \mapsto -u\bar{q}u$ is reflection in this hyperplane. 


Quaternion representation of rotations. Any rotation of $\mathbb{H} = \mathbb{R}^4$ about 
O is a map of the form $q \mapsto vqw$, where v and w are unit quaternions. 


Proof. It follows from the quaternion representation of reflections that the 
result of successive reflections in the hyperplanes orthogonal to the unit 
quaternions $u_1, u_2, \ldots, u_{2n}$ is the map 

\[
q \mapsto u_{2n} \cdots \bar{u}_3 u_2 \bar{u}_1 \, q \, \bar{u}_1 u_2 \bar{u}_3 \cdots u_{2n},
\]

because an even number of sign changes and conjugations makes no 
change. The pre- and postmultipliers are in general two different unit 
quaternions, $u_{2n} \cdots \bar{u}_3 u_2 \bar{u}_1 = v$ and $\bar{u}_1 u_2 \bar{u}_3 \cdots u_{2n} = w$, say, so the general 
rotation of $\mathbb{R}^4$ is a map of the form 

\[
q \mapsto vqw, \quad \text{where } v \text{ and } w \text{ are unit quaternions}.
\]

Conversely, any map of this form is a rotation, because multiplication 
of $\mathbb{H} = \mathbb{R}^4$ on either side by a unit quaternion is an orientation-preserving 
isometry. We already know that multiplication by a unit quaternion is an 
isometry, by Section 1.4. And it preserves orientation by the following 
argument. 

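Multiplication of $\mathbb{H} = \mathbb{R}^4$ by a unit quaternion 

\[
v = \begin{pmatrix} a+id & -b-ic \\ b-ic & a-id \end{pmatrix}, \quad \text{where } a^2+b^2+c^2+d^2 = 1,
\]

is a linear transformation of $\mathbb{R}^4$ with matrix 

\[
R_v = \begin{pmatrix} a & -d & -b & c \\ d & a & -c & -b \\ b & c & a & d \\ -c & b & -d & a \end{pmatrix},
\]

where the $2 \times 2$ submatrices represent the complex-number entries in v. It 
can be checked that $\det(R_v) = 1$. So multiplication by v, on either side, 
preserves orientation. 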
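
The determinant claim can be checked by direct computation. The sketch below assumes the convention that each complex entry x + iy of v is replaced by the real block with rows (x, −y) and (y, x), which gives the 4 × 4 matrix built by the illustrative helper `block_matrix`; `det` is a plain cofactor expansion, adequate for matrices this small.

```python
import math

def block_matrix(a, b, c, d):
    """4x4 real matrix of v = [[a+id, -b-ic], [b-ic, a-id]], using x+iy -> [[x, -y], [y, x]]."""
    return [[ a, -d, -b,  c],
            [ d,  a, -c, -b],
            [ b,  c,  a,  d],
            [-c,  b, -d,  a]]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# A unit quaternion: (1, 2, 3, 4) scaled to absolute value 1.
v = [x / math.sqrt(30.0) for x in (1.0, 2.0, 3.0, 4.0)]
unit_det = det(block_matrix(*v))

# Without normalizing, the determinant is (a^2 + b^2 + c^2 + d^2)^2.
raw_det = det(block_matrix(1, 2, 3, 4))
```

Here `unit_det` is 1 up to rounding error, and `raw_det` equals 30 squared, consistent with the determinant being the square of a² + b² + c² + d².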


Exercises 


The following exercises study the rotation $q \mapsto \mathbf{i}q$ of $\mathbb{H} = \mathbb{R}^4$, first expressing it as a 
product of “plane rotations” (of the planes spanned by 1, i and j, k respectively), 
then breaking it down to a product of four reflections. 


2.5.1 Check that $q \mapsto \mathbf{i}q$ sends 1 to i, i to $-1$ and j to k, k to $-\mathbf{j}$. How many points 
of $\mathbb{R}^4$ are fixed by this map? 


2.5.2 Show that the rotation that sends 1 to i, i to $-1$ and leaves j, k fixed is 
the product of reflections in the hyperplanes orthogonal to $u_1 = \mathbf{i}$ and 
$u_2 = (\mathbf{i} - 1)/\sqrt{2}$. 


2.5.3 Show that the rotation that sends j to k, k to $-\mathbf{j}$ and leaves 1, i fixed is the 
product of reflections in the hyperplanes orthogonal to $u_3 = \mathbf{k}$ and 
$u_4 = (\mathbf{k} - \mathbf{j})/\sqrt{2}$. 


It follows, by the formula $q \mapsto -u\bar{q}u$ for reflection, that the product of 
rotations in Exercises 2.5.2 and 2.5.3 is the product 

\[
q \mapsto u_4\bar{u}_3u_2\bar{u}_1 \, q \, \bar{u}_1u_2\bar{u}_3u_4
\]

of reflections in the hyperplanes orthogonal to $u_1, u_2, u_3, u_4$ respectively. 


2.5.4 Check that $u_4\bar{u}_3u_2\bar{u}_1 = \mathbf{i}$ and $\bar{u}_1u_2\bar{u}_3u_4 = 1$, so the product of the four 
reflections is indeed $q \mapsto \mathbf{i}q$. 


2.6 Direct products of groups 


Before we analyze rotations of $\mathbb{R}^4$ from the viewpoint of group theory, it is 
desirable to review the concept of direct product or Cartesian product of 
groups. 


Definition. If A and B are groups then their direct product $A \times B$ is the set 
of ordered pairs $(a, b)$, where $a \in A$ and $b \in B$, under the “product of pairs” 
operation defined by 

\[
(a_1, b_1)(a_2, b_2) = (a_1a_2, b_1b_2).
\]

It is easy to check that this product operation is associative, that the 
identity element of $A \times B$ is the pair $(1_A, 1_B)$, where $1_A$ is the identity of 
A and $1_B$ is the identity of B, and that $(a, b)$ has inverse $(a^{-1}, b^{-1})$. Thus 
$A \times B$ is indeed a group. 

Many important groups are nontrivial direct products; that is, they have 
the form $A \times B$ where neither A nor B is the trivial group {1}. For example: 


• The group $\mathbb{R}^2$, under vector addition, is the direct product $\mathbb{R} \times \mathbb{R}$. 
More generally, $\mathbb{R}^n$ is the n-fold direct product $\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}$. 

• If A and B are groups of $n \times n$ matrices, then the matrices of the form 

\[
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad \text{where } a \in A \text{ and } b \in B,
\]

make up a group isomorphic to $A \times B$ under matrix multiplication, 
where 0 is the $n \times n$ zero matrix. This is because of the so-called 
block multiplication of matrices, according to which 

\[
\begin{pmatrix} a_1 & 0 \\ 0 & b_1 \end{pmatrix} \begin{pmatrix} a_2 & 0 \\ 0 & b_2 \end{pmatrix} = \begin{pmatrix} a_1a_2 & 0 \\ 0 & b_1b_2 \end{pmatrix}.
\]

• It follows, from the previous item, that $\mathbb{R}^n$ is isomorphic to a $2n \times 2n$ 
matrix group, because $\mathbb{R}$ is isomorphic to the group of matrices 

\[
\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \quad \text{where } x \in \mathbb{R}.
\]

• The group $S^1 \times S^1$ is a group called the (two-dimensional) torus $\mathbb{T}^2$. 
More generally, the n-fold direct product of $S^1$ factors is called the 
n-dimensional torus $\mathbb{T}^n$. 
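
The two matrix examples in this list can be illustrated with a few lines of code. The sketch below is only an illustration with hypothetical helper names (`matmul`, `shear`, `block_diag`) and integer matrices chosen for exact arithmetic. It checks that multiplying the 2 × 2 matrices with rows (1, x) and (0, 1) adds the parameters x, and that block-diagonal matrices multiply block by block.

```python
def matmul(a, b):
    """Product of matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def shear(x):
    """The matrix with rows (1, x) and (0, 1), representing the real number x."""
    return [[1, x], [0, 1]]

def block_diag(a, b):
    """The block matrix with a and b on the diagonal (a and b both n x n)."""
    n = len(a)
    zero = [0] * n
    return [row + zero for row in a] + [zero + row for row in b]

# Multiplying shear matrices adds their parameters, so x -> shear(x) turns
# addition in R into matrix multiplication.
sum_as_product = matmul(shear(2), shear(5))

# Block multiplication: the product of block-diagonal matrices is the
# block-diagonal matrix of the products.
a1, a2 = [[0, -1], [1, 0]], [[1, 2], [3, 4]]
b1, b2 = shear(3), [[2, 0], [0, 1]]
lhs = matmul(block_diag(a1, b1), block_diag(a2, b2))
rhs = block_diag(matmul(a1, a2), matmul(b1, b2))
```

Here `sum_as_product` equals `shear(7)` and `lhs` equals `rhs`. The matrices `a2` and `b2` are arbitrary test inputs, not elements of any particular group.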


We call $S^1 \times S^1$ a torus because its elements $(\theta, \varphi)$, where $\theta, \varphi \in S^1$, 
can be viewed as the points on the torus surface (Figure 2.6). 


Figure 2.6: The torus $S^1 \times S^1$. 


Since the groups $\mathbb{R}$ and $S^1$ are abelian, the same is true of all their 
direct products $\mathbb{R}^m \times \mathbb{T}^n$. It can be shown that the latter groups include all 
the connected abelian matrix Lie groups. 


Exercises 


If we let $x_1, x_2, x_3, x_4$ be the coordinates along mutually orthogonal axes in $\mathbb{R}^4$, 
then it is possible to “rotate” the $x_1$ and $x_2$ axes while keeping the $x_3$ and $x_4$ axes 
fixed. 


2.6.1 Write a $4 \times 4$ matrix for the transformation that rotates the $(x_1, x_2)$-plane 
through angle $\theta$ while keeping the $x_3$- and $x_4$-axes fixed. 


2.6.2 Write a $4 \times 4$ matrix for the transformation that rotates the $(x_3, x_4)$-plane 
through angle $\varphi$ while keeping the $x_1$- and $x_2$-axes fixed. 


2.6.3 Observe that the rotations in Exercise 2.6.1 form an $S^1$, as do the rotations 
in Exercise 2.6.2, and deduce that SO(4) contains a subgroup isomorphic 
to $\mathbb{T}^2$. 


The groups of the form $\mathbb{R}^m \times \mathbb{T}^n$ may be called “generalized cylinders,” based 
on the simplest example $\mathbb{R} \times S^1$. 


2.6.4 Why is it appropriate to call the group $\mathbb{R} \times S^1$ a cylinder? 


The notation $S^n$ is unfortunately not compatible with the direct product 
notation (at least not the way the notation $\mathbb{R}^n$ is). 


2.6.5 Explain why $S^3 = \mathrm{SU}(2)$ is not the same group as $S^1 \times S^1 \times S^1$. 


2.7 The map from SU(2) × SU(2) to SO(4) 


In Section 2.5 we showed that the rotations of $\mathbb{R}^4$ are precisely the maps 
$q \mapsto vqw$, where v and w run through all the unit quaternions. Since $v^{-1}$ 
is a unit quaternion if and only if v is, it is equally valid to represent each 
rotation of $\mathbb{R}^4$ by a map of the form $q \mapsto v^{-1}qw$, where v and w are unit 
quaternions. The latter representation is more convenient for what comes 
next. 

The pairs of unit quaternions $(v, w)$ form a group under the operation 
defined by 

\[
(v_1, w_1) \cdot (v_2, w_2) = (v_1v_2, w_1w_2),
\]

where the products $v_1v_2$ and $w_1w_2$ on the right side are ordinary quaternion 
products. Since the v come from the group SU(2) of unit quaternions, and 
the w likewise, the group of pairs $(v, w)$ is the direct product SU(2) × SU(2) 
of SU(2) with itself. 

The map that sends each pair $(v, w) \in \mathrm{SU}(2) \times \mathrm{SU}(2)$ to the rotation 
$q \mapsto v^{-1}qw$ in SO(4) is a homomorphism $\varphi : \mathrm{SU}(2) \times \mathrm{SU}(2) \to \mathrm{SO}(4)$. 
This is because 

• the product of the map $q \mapsto v_1^{-1}qw_1$ corresponding to $(v_1, w_1)$ 

• with the map $q \mapsto v_2^{-1}qw_2$ corresponding to $(v_2, w_2)$ 

• is the map $q \mapsto v_2^{-1}v_1^{-1}qw_1w_2$, 

which is the map $q \mapsto (v_1v_2)^{-1}q(w_1w_2)$ corresponding to the product 
$(v_1v_2, w_1w_2)$ of $(v_1, w_1)$ and $(v_2, w_2)$. 


This homomorphism is onto SO(4), because each rotation of $\mathbb{R}^4$ can 
be expressed in the form $q \mapsto v^{-1}qw$, but one might expect it to be very 
many-to-one, since many pairs $(v, w)$ of unit quaternions conceivably give 
the same rotation. Surprisingly, this is not so. The representation of 
rotations by pairs is “unique up to sign” in the following sense: if $(v, w)$ 
gives a certain rotation, the only other pair that gives the same rotation is 
$(-v, -w)$. 

To prove this, it suffices to prove that the kernel of the homomorphism 
$\varphi : \mathrm{SU}(2) \times \mathrm{SU}(2) \to \mathrm{SO}(4)$ has two elements. 


Size of the kernel. The homomorphism $\varphi : \mathrm{SU}(2) \times \mathrm{SU}(2) \to \mathrm{SO}(4)$ is 
2-to-1, because its kernel has two elements. 


Proof. Suppose that $(v, w)$ is in the kernel, so $q \mapsto v^{-1}qw$ is the identity 
rotation. In particular, this rotation fixes 1, so 

\[
v^{-1}1w = 1; \quad \text{hence } v = w.
\]

Thus the map is in fact $q \mapsto v^{-1}qv$, which we know (from Section 1.5) fixes 
the real axis and rotates the space of pure imaginary quaternions. Only if 
$v = 1$ or $v = -1$ does the map fix everything; hence the kernel of $\varphi$ has 
only two elements, $(1, 1)$ and $(-1, -1)$. 

The left cosets of the kernel are therefore the 2-element sets 

\[
(v, w)(\pm 1, \pm 1) = (\pm v, \pm w),
\]

and each coset corresponds to a distinct rotation of $\mathbb{R}^4$, by the fundamental 
homomorphism theorem of Section 2.2. 
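
Both the homomorphism property and the “unique up to sign” statement can be spot-checked numerically. The sketch below assumes quaternions are stored as 4-tuples (a, b, c, d) for a + bi + cj + dk; the helper names, including `rotation`, are illustrative. It tests the composition rule on one pair of examples and confirms that (v, w) and (−v, −w) give the same map while (v, −w) does not.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    """Quaternion conjugate; for a unit quaternion this is also the inverse."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def normalize(q):
    """Scale a nonzero quaternion to absolute value 1."""
    n = math.sqrt(sum(x * x for x in q))
    return tuple(x / n for x in q)

def neg(q):
    """The antipodal quaternion -q."""
    return tuple(-x for x in q)

def rotation(v, w):
    """The map q -> v^{-1} q w attached to the pair (v, w) of unit quaternions."""
    return lambda q: qmul(qmul(qconj(v), q), w)

def close(p, q, tol=1e-12):
    """Componentwise comparison up to floating-point error."""
    return all(abs(x - y) < tol for x, y in zip(p, q))

v1, w1 = normalize((1.0, 2.0, 3.0, 4.0)), normalize((2.0, -1.0, 0.5, 3.0))
v2, w2 = normalize((0.5, -1.0, 2.0, 0.25)), normalize((-1.0, 1.0, 1.0, 2.0))
q = (0.3, -1.2, 2.0, 0.7)

# First the rotation for (v1, w1), then the rotation for (v2, w2) ...
composed = rotation(v2, w2)(rotation(v1, w1)(q))
# ... agrees with the rotation for the product pair (v1 v2, w1 w2).
via_product = rotation(qmul(v1, v2), qmul(w1, w2))(q)
is_homomorphism = close(composed, via_product)

same_under_sign = close(rotation(v1, w1)(q), rotation(neg(v1), neg(w1))(q))
differs_otherwise = not close(rotation(v1, w1)(q), rotation(v1, neg(w1))(q))
```

The particular unit quaternions are arbitrary; any other choice should give the same three truth values.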


This theorem shows that SO(4) is “almost” the same as SU(2) × SU(2), 
and the latter is far from being a simple group. For example, the subgroup 
of pairs $(v, 1)$ is a nontrivial normal subgroup, but clearly not the whole of 
SU(2) × SU(2). This gives us a way to show that SO(4) is not simple. 


SO(4) is not simple. There is a nontrivial normal subgroup of SO(4), not 
equal to SO(4). 


Proof. The subgroup of pairs $(v, 1) \in \mathrm{SU}(2) \times \mathrm{SU}(2)$ is normal; in fact, it 
is the kernel of the map $(v, w) \mapsto (1, w)$, which is clearly a homomorphism. 

The corresponding subgroup of SO(4) consists of maps of the form 
$q \mapsto v^{-1}q1$, which likewise form a normal subgroup of SO(4). But this 
subgroup is not the whole of SO(4). For example, it does not include the 
map $q \mapsto qw$ for any $w \ne \pm 1$, by the “unique up to sign” representation of 
rotations by pairs $(v, w)$. 


Exercises 


An interesting subgroup $\mathrm{Aut}(\mathbb{H})$ of SO(4) consists of the continuous 
automorphisms of $\mathbb{H} = \mathbb{R}^4$. These are the continuous bijections $\rho : \mathbb{H} \to \mathbb{H}$ that preserve 
the quaternion sum and product, that is, 

\[
\rho(p+q) = \rho(p) + \rho(q), \quad \rho(pq) = \rho(p)\rho(q) \quad \text{for any } p, q \in \mathbb{H}.
\]

It is easy to check that, for each unit quaternion u, the $\rho$ that sends $q \mapsto u^{-1}qu$ 
is an automorphism (first exercise), so it follows from Section 1.5 that $\mathrm{Aut}(\mathbb{H})$ 
includes the SO(3) of rotations of the 3-dimensional subspace $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ of 
pure imaginary quaternions. The purpose of this set of exercises is to show that 
all continuous automorphisms of $\mathbb{H}$ are of this form, so $\mathrm{Aut}(\mathbb{H}) = \mathrm{SO}(3)$. 


2.7.1 Check that $q \mapsto u^{-1}qu$ is an automorphism of $\mathbb{H}$ for any unit quaternion u. 

Now suppose that $\rho$ is any automorphism of $\mathbb{H}$. 

2.7.2 Use the preservation of sums by an automorphism $\rho$ to deduce in turn that 

• $\rho$ preserves 0, that is, $\rho(0) = 0$, 
• $\rho$ preserves differences, that is, $\rho(p - q) = \rho(p) - \rho(q)$. 

2.7.3 Use preservation of products to deduce that 

• $\rho$ preserves 1, that is, $\rho(1) = 1$, 
• $\rho$ preserves quotients, that is, $\rho(p/q) = \rho(p)/\rho(q)$ for $q \ne 0$. 


2.7.4 Deduce from Exercises 2.7.2 and 2.7.3 that $\rho(m/n) = m/n$ for any integers 
m and $n \ne 0$. This implies $\rho(r) = r$ for any real r, and hence that $\rho$ is a 
linear map of $\mathbb{R}^4$. Why? 


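Thus we now know that a continuous automorphism $\rho$ is a linear bijection 
of $\mathbb{R}^4$ that preserves the real axis, and hence $\rho$ maps $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ onto itself. It 
remains to show that the restriction of $\rho$ to $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ is a rotation, that is, an 
orientation-preserving isometry, because we know from Section 1.5 that rotations 
of $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$ are of the form $q \mapsto u^{-1}qu$. 

2.7.5 Prove in turn that 

• $\rho$ preserves conjugates, that is, $\rho(\bar{q}) = \overline{\rho(q)}$, 
• $\rho$ preserves distance, 
• $\rho$ preserves inner product in $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, 
• $\rho(p \times q) = \rho(p) \times \rho(q)$ in $\mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$, and hence $\rho$ preserves 
orientation. 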


The appearance of SO(3) as the automorphism group of the quaternion 
algebra $\mathbb{H}$ suggests that the automorphism group of the octonion algebra $\mathbb{O}$ might 
also be of interest. It turns out to be a 14-dimensional group called $G_2$, the first 
of the exceptional Lie groups mentioned (along with $\mathbb{O}$) in Section 1.6. This link 
between $\mathbb{O}$ and the exceptional groups was pointed out by Cartan [1908]. 


2.8 Discussion 


The concept of simple group emerged around 1830 from Galois’s theory 
of equations. Galois showed that each polynomial equation has a finite 
group of “symmetries” (permutations of its roots that leave its coefficients 
invariant), and that the equation is solvable only if its group decomposes 
in a certain way. In particular, the general quintic equation is not solvable 
because its group contains the nonabelian simple group $A_5$, the group of 
even permutations of five objects. The same applies to the general equation 
of any degree greater than 5, because $A_n$, the group of even permutations 
of n objects, is simple for any $n \ge 5$. 

With this discovery, Galois effectively closed the classical theory of 
equations, but he opened the (much larger) theory of groups. Specifically, 
by exhibiting the nontrivial infinite family $A_n$ for $n \ge 5$, he raised 
the problem of finding and classifying all finite simple groups. This prob- 
lem is much deeper than anyone could have imagined in the time of Galois, 
because it depends on solving the corresponding problem for continuous 
groups, or Lie groups as we now call them. 

Around 1870, Sophus Lie was inspired by Galois theory to develop an 
analogous theory of differential equations and their “symmetries,” which 
generally form continuous groups. As with polynomial equations, simple 
groups raise an obstacle to solvability. However, at that time it was not 
clear what the generalization of the group concept from finite to continuous 
should be. Lie understood continuous groups to be groups generated by 
“infinitesimal” elements, so he thought that the rotation group of $\mathbb{R}^3$ should 
include “infinitesimal rotations.” Today, we separate out the “infinitesimal 
rotations” of $\mathbb{R}^3$ in a structure called $\mathfrak{so}(3)$, the Lie algebra of SO(3). The 
concept of simplicity also makes sense for $\mathfrak{so}(3)$, and is somewhat easier 
to establish. Indeed, the infinitesimal elements of any continuous group G 
form a structure $\mathfrak{g}$ now called the Lie algebra of G, which captures most 
of the structure of G but is easier to handle. We discuss “infinitesimal 
elements,” and their modern counterparts, further in Section 4.3. 


It was a stroke of luck (or genius) that Lie decided to look at infinitesi- 
mal elements, because it enabled him to prove simplicity for whole infinite 
families of Lie algebras in one fell swoop. (As we will see later, most of 
the corresponding continuous groups are not quite simple, and one has to 
tease out certain small subgroups and quotient by them.) Around 1885 Lie 
proved results so general that they cover all but a finite number of simple 
Lie algebras—namely, those of the exceptional groups mentioned at the 
end of Chapter 1 (see Hawkins [2000], pp. 92-98). 


In the avalanche of Lie’s results, the special case of $\mathfrak{so}(3)$ and SO(3) 
seems to have gone unnoticed. It gradually came to light as twentieth- 
century books on Lie theory started to work out special cases of geometric 
interest by way of illustration. In the 1920s, quantum physics also directed 
attention to SO(3), since rotations in three dimensions are physically sig- 
nificant. Still, it is remarkable that a purely geometric argument for the 
simplicity of SO(3) took so long to emerge. Perhaps its belated appear- 
ance is due to its topological content, namely, the step that depends purely 
on continuity. The argument hinges on the fact that $\theta$ is a continuous function 
of distance along the great circle PQ, and that such a function takes 
every value between its extreme values: the so-called intermediate value 
theorem. 


The theory of continuity (topology) came after the theory of continuous 
groups—not surprisingly, since one does not bother to develop a theory 
of continuity before seeing that it has some content—and applications of 
topology to group theory were rare before the 1920s. In this book we will 
present further isolated examples of continuity arguments in Sections 3.2, 
3.8, and 7.5 before taking up topology systematically in Chapter 8. 


Another book with a strongly geometric treatment of SO(3) is Berger 
[1987]. Volume I of Berger, p. 169, has a simplicity proof for SO(3) similar 
to the one given here, and it is extended to a simplicity result about SO(n), 
for $n \ge 5$, on p. 170: SO(2m + 1) is simple and the only nontrivial normal 
subgroup of SO(2m) is $\{\pm 1\}$. We arrive at the same result by a different 
route in Section 7.5. (Our route is longer, but it also takes in the complex 
and quaternion analogues of SO(n).) Berger treats SO(4) with the help 
of quaternions on p. 190 of his Volume II, much as we have done here. 
The quaternion representation of rotations of $\mathbb{R}^4$ was another of Cayley’s 
discoveries, made in 1855. 

Lie observed the anomalous structure of SO(4) at the infinitesimal 
level. He mentions it, in scarcely recognizable form, on p. 683 of Volume 
II of his 1893 book Theorie der Transformationsgruppen. The anomaly of 
SO(4) is hidden in some modern treatments of Lie theory, where the con- 
cept of simplicity is superseded by the more general concept of semisim- 
plicity. All simple groups are semisimple, and SO(4) is semisimple, so an 
anomaly is removed by relaxing the concept of “simple” to “semisimple.” 
However, the concept of semisimplicity makes little sense before one has 
absorbed the concept of simplicity, and our goal in this book is to under- 
stand the simple groups, notwithstanding the anomaly of SO(4). 


3 


Generalized rotation groups 


PREVIEW 


In this chapter we generalize the plane and space rotation groups SO(2) 
and SO(3) to the special orthogonal group SO(n) of orientation-preserving 
isometries of $\mathbb{R}^n$ that fix O. To deal uniformly with the concept of 
“rotation” in all dimensions we make use of the standard inner product on $\mathbb{R}^n$ 
and consider the linear transformations that preserve it. 

Such transformations have determinant +1 or −1 according as they 
preserve orientation or not, so SO(n) consists of those with determinant 1. 
Those with determinant $\pm 1$ make up the full orthogonal group, O(n). 

These ideas generalize further, to the space $\mathbb{C}^n$ with inner product 
defined by 

\[
(u_1, u_2, \ldots, u_n) \cdot (v_1, v_2, \ldots, v_n) = u_1\bar{v}_1 + u_2\bar{v}_2 + \cdots + u_n\bar{v}_n. \tag{*}
\]

The group of linear transformations of $\mathbb{C}^n$ preserving (*) is called the 
unitary group U(n), and the subgroup of transformations with determinant 1 
is the special unitary group SU(n). 

There is one more generalization of the concept of isometry: to the 
space $\mathbb{H}^n$ of ordered n-tuples of quaternions. $\mathbb{H}^n$ has an inner product 
defined like (*) (but with quaternion conjugates), and the group of linear 
transformations preserving it is called the symplectic group Sp(n). 

In the rest of the chapter we work out some easily accessible properties 
of the generalized rotation groups: their maximal tori, centers, and their 
path-connectedness. These properties later turn out to be crucial for the 
problem of identifying simple Lie groups. 


48 J. Stillwell, Naive Lie Theory, DOI: 10.1007/978-0-387-78214-0_3, 


3.1 Rotations as orthogonal transformations 


It follows from the Cartan–Dieudonné theorem of Section 2.4 that a 
rotation about O in $\mathbb{R}^2$ or $\mathbb{R}^3$ is a linear transformation that preserves length 
and orientation. We therefore adopt this description as the definition of a 
rotation in $\mathbb{R}^n$. However, when the transformation is given by a matrix, 
it is not easy to see directly whether it preserves length or orientation. A 
more practical criterion emerges from consideration of the standard inner 
product in $\mathbb{R}^n$, whose geometric properties we now summarize. 

If $u = (u_1, u_2, \ldots, u_n)$ and $v = (v_1, v_2, \ldots, v_n)$ are two vectors in $\mathbb{R}^n$, 
their inner product $u \cdot v$ is defined by 

\[
u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n.
\]

It follows immediately that 

\[
u \cdot u = u_1^2 + u_2^2 + \cdots + u_n^2 = |u|^2,
\]

so the length $|u|$ of u (that is, the distance of u from the origin 0) is 
definable in terms of the inner product. It also follows (as one learns in linear 
algebra courses) that $u \cdot v = 0$ if and only if u and v are orthogonal, and 
more generally that 

\[
u \cdot v = |u||v|\cos\theta,
\]

where $\theta$ is the angle between the lines from 0 to u and 0 to v. Thus angle 
is also definable in terms of inner product. Conversely, inner product is 
definable in terms of length and angle. Moreover, an angle $\theta$ is determined 
by $\cos\theta$ and $\sin\theta$, which are the ratios of lengths in a certain triangle, so 
inner product is in fact definable in terms of length alone. 

This means that a transformation T preserves length if and only if T 
preserves the inner product, that is, 

\[
T(u) \cdot T(v) = u \cdot v \quad \text{for all } u, v \in \mathbb{R}^n.
\]


The inner product is a more convenient concept than length when one is 
working with linear transformations, because linear transformations are 
represented by matrices and the inner product occurs naturally within 
matrix multiplication: if A and B are matrices for which AB exists then 

\[
(i, j)\text{-element of } AB = (\text{row } i \text{ of } A) \cdot (\text{column } j \text{ of } B).
\]

This observation is the key to the following concise and practical criterion 
for recognizing rotations, involving the matrix A and its transpose $A^{\mathrm{T}}$. To 
state it we introduce the notation $\mathbf{1}$ for the identity matrix, of any size, 
extending the notation used in Chapter 1 for the $2 \times 2$ identity matrix. 


Rotation criterion. An $n \times n$ real matrix A represents a rotation of $\mathbb{R}^n$ if 
and only if 

\[
AA^{\mathrm{T}} = \mathbf{1} \quad \text{and} \quad \det(A) = 1.
\]


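Proof. First we show that the condition $AA^{\mathrm{T}} = \mathbf{1}$ is equivalent to 
preservation of the inner product by A. 

\[
\begin{aligned}
AA^{\mathrm{T}} = \mathbf{1} &\iff (\text{row } i \text{ of } A) \cdot (\text{col } j \text{ of } A^{\mathrm{T}}) = \delta_{ij}, \\
&\qquad \text{where } \delta_{ij} = 1 \text{ if } i = j \text{ and } \delta_{ij} = 0 \text{ if } i \ne j \\
&\iff (\text{row } i \text{ of } A) \cdot (\text{row } j \text{ of } A) = \delta_{ij} \\
&\iff \text{rows of } A \text{ form an orthonormal basis} \\
&\iff \text{columns of } A \text{ form an orthonormal basis,} \\
&\qquad \text{because } AA^{\mathrm{T}} = \mathbf{1} \text{ means } A^{\mathrm{T}} = A^{-1}, \text{ so } \mathbf{1} = A^{\mathrm{T}}A = A^{\mathrm{T}}(A^{\mathrm{T}})^{\mathrm{T}}, \\
&\qquad \text{and hence } A^{\mathrm{T}} \text{ has the same property as } A \\
&\iff A\text{-images of the standard basis form an orthonormal basis} \\
&\iff A \text{ preserves the inner product,}
\end{aligned}
\]

because $Ae_i \cdot Ae_j = \delta_{ij} = e_i \cdot e_j$, where 

\[
e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \ \ldots, \ e_n = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}
\]

are the standard basis vectors of $\mathbb{R}^n$. 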


Second, the condition $\det(A) = 1$ says that A preserves orientation, as 
mentioned at the beginning of Section 2.5. Standard properties of 
determinants give 

\[
\det(AA^{\mathrm{T}}) = \det(A)\det(A^{\mathrm{T}}) \quad \text{and} \quad \det(A^{\mathrm{T}}) = \det(A),
\]

so we already have 

\[
1 = \det(\mathbf{1}) = \det(AA^{\mathrm{T}}) = \det(A)\det(A^{\mathrm{T}}) = \det(A)^2.
\]

And the two solutions $\det(A) = 1$ and $\det(A) = -1$ occur according as A 
preserves orientation or not. 
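
The criterion translates directly into a test that can be run on a given matrix. The following is a minimal sketch, assuming matrices are lists of rows of floats and using a tolerance for floating-point error; the function names (`matmul`, `transpose`, `det`, `is_rotation`) are illustrative, and the cofactor-expansion determinant is only sensible for small n.

```python
import math

def matmul(a, b):
    """Product of matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_rotation(a, tol=1e-9):
    """Check the rotation criterion: A A^T = 1 and det(A) = 1, up to tol."""
    n = len(a)
    prod = matmul(a, transpose(a))
    orthogonal = all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < tol
                     for i in range(n) for j in range(n))
    return orthogonal and abs(det(a) - 1.0) < tol

c, s = math.cos(0.7), math.sin(0.7)
plane_rotation = [[c, -s], [s, c]]          # rotation of the plane through 0.7 radians
reflection = [[-1.0, 0.0], [0.0, 1.0]]      # orthogonal, but determinant -1
dilation = [[2.0, 0.0], [0.0, 2.0]]         # preserves angles, but not length
```

With these inputs, `is_rotation(plane_rotation)` is true, while the reflection fails the determinant condition and the dilation fails the condition on the product with its transpose.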


A rotation matrix is called a special orthogonal matrix, presumably
because its rows (or columns) form an orthonormal basis. The matrices
that preserve length, but not necessarily orientation, are called orthogonal.
(However, orthogonal matrices are not the only matrices that preserve
orthogonality. Orthogonality is also preserved by the dilation matrices k1 for
any nonzero constant k.)
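The rotation criterion lends itself to a direct numerical check. The following is a minimal sketch in plain Python, not part of the original text: the helper names (`transpose`, `matmul`, `det`, `is_orthogonal`, `is_rotation`) and the tolerance `tol` are illustrative assumptions, and the determinant is computed by cofactor expansion, which is only sensible for small matrices.

```python
import math

def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    """Matrix product: entry (i, j) is (row i of a) . (column j of b)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def det(a):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(a) == 1:
        return a[0][0]
    total = 0.0
    for j, entry in enumerate(a[0]):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def is_orthogonal(a, tol=1e-9):
    """Check A A^T = 1 up to the (assumed) tolerance tol."""
    p = matmul(a, transpose(a))
    n = len(a)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def is_rotation(a, tol=1e-9):
    """Rotation criterion: A A^T = 1 and det(A) = 1."""
    return is_orthogonal(a, tol) and abs(det(a) - 1.0) <= tol

theta = 0.7  # arbitrary sample angle
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
reflection = [[-1.0, 0.0], [0.0, 1.0]]   # orthogonal, but determinant -1
dilation = [[2.0, 0.0], [0.0, 2.0]]      # preserves orthogonality, not length

print(is_rotation(rotation), is_orthogonal(reflection),
      is_rotation(reflection), is_orthogonal(dilation))
```

The reflection passes the orthogonality test but fails the determinant test, and the dilation fails the orthogonality test, which matches the distinctions drawn in the paragraph above.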


Exercises 


3.1.1 Give an example of a matrix in O(2) that is not in SO(2). 


3.1.2 Give an example of a matrix in O(3) that is not in SO(3), and interpret it 
geometrically. 


3.1.3 Work out the matrix for the reflection of $\mathbb{R}^3$ in the plane through O
orthogonal to the unit vector (a,b,c).


3.2 The orthogonal and special orthogonal groups 


It follows from the definition of special orthogonal matrices that: 


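• If $A_1$ and $A_2$ are orthogonal, then $A_1A_1^T = 1$ and $A_2A_2^T = 1$. It follows
that the product $A_1A_2$ satisfies
\begin{align*}
(A_1A_2)(A_1A_2)^T &= A_1A_2A_2^TA_1^T && \text{because } (A_1A_2)^T = A_2^TA_1^T, \\
&= A_1A_1^T && \text{because } A_2A_2^T = 1, \\
&= 1 && \text{because } A_1A_1^T = 1.
\end{align*}

• If $A_1$ and $A_2$ are special orthogonal, then $\det(A_1) = \det(A_2) = 1$, so
\[ \det(A_1A_2) = \det(A_1)\det(A_2) = 1. \]

• If $A$ is orthogonal, then $AA^T = 1$, hence $A^{-1} = A^T$. It follows that
$(A^{-1})^T = (A^T)^T = A$, and therefore $A^{-1}(A^{-1})^T = A^TA = 1$, so $A^{-1}$ is
also orthogonal. And $A^{-1}$ is special orthogonal if $A$ is because
\[ \det(A^{-1}) = \det(A)^{-1} = 1. \]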


Thus products and inverses of n × n special orthogonal matrices are special
orthogonal, and hence they form a group. This group (the “rotation” group
of $\mathbb{R}^n$) is called the special orthogonal group SO(n).
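The closure properties just listed can be illustrated numerically for SO(3). This is only a sketch under stated assumptions: the helpers `rot_z`, `rot_x`, `det3`, and `in_so3` are hypothetical names introduced here, the sample angles are arbitrary, and membership is tested up to a floating-point tolerance.

```python
import math

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(a):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def in_so3(a, tol=1e-9):
    """Test A A^T = 1 and det(A) = 1 up to tol."""
    p = matmul(a, transpose(a))
    orthogonal = all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(a) - 1.0) <= tol

def rot_z(t):
    """Rotation about the third coordinate axis through angle t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation about the first coordinate axis through angle t."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

a1, a2 = rot_z(0.4), rot_x(1.1)
product = matmul(a1, a2)
inverse = transpose(product)  # for an orthogonal matrix the inverse is the transpose

print(in_so3(product), in_so3(inverse))
```

Both the product and its inverse pass the membership test, as the three bullet points predict.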

If we drop the requirement that orientation be preserved, then we get
a larger group of transformations of $\mathbb{R}^n$ called the orthogonal group O(n).
An example of a transformation that is in O(n), but not in SO(n), is
reflection in the hyperplane orthogonal to the $x_1$-axis,
$(x_1, x_2, x_3, \ldots, x_n) \mapsto (-x_1, x_2, x_3, \ldots, x_n)$, which has the matrix
\[ \begin{pmatrix} -1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}, \]
obviously of determinant −1. We notice that the determinant of a matrix
$A \in$ O(n) is ±1 because (as mentioned in the previous section)
\[ AA^T = 1 \implies 1 = \det(AA^T) = \det(A)\det(A^T) = \det(A)^2. \]


Path-connectedness 


The most striking difference between SO(n) and O(n) is a topological one: 
SO(n) is path-connected and O(n) is not. That is, if we view n x n matrices 
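as points of $\mathbb{R}^{n^2}$ in the natural way (by interpreting the $n^2$ matrix entries
$a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, \ldots, a_{2n}, \ldots, a_{n1}, \ldots, a_{nn}$ as the coordinates of a point),
then any two points in SO(n) may be connected by a continuous path in
SO(n), but the same is not true of O(n). Indeed, there is no continuous
path in O(n) from
\[ \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix} \quad \text{to} \quad \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix} \]
(where the entries left blank are all zero) because the value of the
determinant cannot jump from 1 to −1 along a continuous path.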

The path-connectedness of SO(n) is not quite obvious, but it is
interesting because it reconciles the everyday concept of “rotation” with the
mathematical concept. In mathematics, a rotation of $\mathbb{R}^n$ is given by
specifying just one configuration, usually the final position of the basis vectors,
in terms of their initial position. This position is expressed by a matrix
A. In everyday speech, a “rotation” is a movement through a continuous
sequence of positions, so it corresponds to a path in SO(n) connecting the
initial matrix 1 to the final matrix A.

Thus a final position A of $\mathbb{R}^n$ can be realized by a “rotation” in the
everyday sense of the word only if SO(n) is path-connected.


Path-connectedness of SO(n). For any n, SO(n) is path-connected. 


Proof. For n = 2 we have the circle SO(2), which is obviously path- 
connected (Figure 3.1). Now suppose that SO(n — 1) is path-connected 
and that A € SO(n). It suffices to find a path in SO(n) from 1 to A, because 
if there are paths from 1 to A and B then there is a path from A to B. 


Figure 3.1: Path-connectedness of SO(2) (the circle of points $\cos\theta + i\sin\theta$).


This amounts to finding a continuous motion taking the basis vectors
$e_1, e_2, \ldots, e_n$ to their final positions $Ae_1, Ae_2, \ldots, Ae_n$ (the columns of $A$).

The vectors $e_1$ and $Ae_1$ (if distinct) define a plane $\mathcal{P}$, so, by the
path-connectedness of SO(2), we can move $e_1$ continuously to the position $Ae_1$
by a rotation $R$ of $\mathcal{P}$. It then suffices to continuously move $Re_2, \ldots, Re_n$ to
$Ae_2, \ldots, Ae_n$, respectively, keeping $Ae_1$ fixed. Notice that

• $Re_2, \ldots, Re_n$ are all orthogonal to $Re_1 = Ae_1$, because $e_2, \ldots, e_n$ are
all orthogonal to $e_1$ and $R$ preserves angles.

• $Ae_2, \ldots, Ae_n$ are all orthogonal to $Ae_1$, because $e_2, \ldots, e_n$ are all
orthogonal to $e_1$ and $A$ preserves angles.

Thus the required motion can take place in the $\mathbb{R}^{n-1}$ of vectors orthogonal
to $Ae_1$, where it exists by the assumption that SO(n − 1) is path-connected.

Performing the two motions in succession (taking $e_1$ to $Ae_1$ and then
$Re_2, \ldots, Re_n$ to $Ae_2, \ldots, Ae_n$) gives a path from 1 to $A$ in SO(n).


The idea of path-connectedness will be explored further in Sections 3.8
and 8.6. In the meantime, the idea of continuous path is used informally in
the exercises below to show that path-connectedness has interesting
algebraic implications.


Exercises 


The following exercises study the identity component in a matrix group G, that is,
the set of matrices $A \in G$ for which there is a continuous path from 1 to A that lies
inside G.


3.2.1 Bearing in mind that matrix multiplication is a continuous operation, show
that if there are continuous paths in G from 1 to $A \in G$ and to $B \in G$ then
there is a continuous path in G from A to AB.


3.2.2 Similarly, show that if there is a continuous path in G from 1 to A, then
there is also a continuous path from $A^{-1}$ to 1.


3.2.3 Deduce from Exercises 3.2.1 and 3.2.2 that the identity component of G is
a subgroup of G.


3.3 The unitary groups


The unitary groups U(n) and SU(n) are the analogues of the orthogonal
groups O(n) and SO(n) for the complex vector space $\mathbb{C}^n$, which consists
of the ordered n-tuples $(z_1, z_2, \ldots, z_n)$ of complex numbers. The sum
operation on $\mathbb{C}^n$ is the usual vector addition:
\[ (u_1, u_2, \ldots, u_n) + (v_1, v_2, \ldots, v_n) = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n). \]


And the multiple of $(z_1, z_2, \ldots, z_n) \in \mathbb{C}^n$ by a scalar $c \in \mathbb{C}$ is naturally
$(cz_1, cz_2, \ldots, cz_n)$. The twist comes with the inner product, because we
would like the inner product of a vector $v$ with itself to be a real number,
namely the squared distance $|v|^2$ from the zero vector 0 to $v$. We ensure this by
the definition
\[ (u_1, u_2, \ldots, u_n) \cdot (v_1, v_2, \ldots, v_n) = u_1\overline{v_1} + u_2\overline{v_2} + \cdots + u_n\overline{v_n}. \tag{*} \]
With this definition of $u \cdot v$ we have
\[ v \cdot v = v_1\overline{v_1} + v_2\overline{v_2} + \cdots + v_n\overline{v_n} = |v_1|^2 + |v_2|^2 + \cdots + |v_n|^2 = |v|^2, \]

and $|v|^2$ is indeed the squared distance of $v = (v_1, v_2, \ldots, v_n)$ from 0 in the
space $\mathbb{R}^{2n}$ that equals $\mathbb{C}^n$ when we interpret each copy of $\mathbb{C}$ as $\mathbb{R}^2$.

The kind of inner product defined by (*) is called Hermitian (after the
nineteenth-century French mathematician Charles Hermite). Just as one
meets ordinary inner products of rows when forming the product
\[ AA^T, \quad \text{for a real matrix } A, \]
so too one meets the Hermitian inner product (*) of rows when forming the
product
\[ A\overline{A}^T, \quad \text{for a complex matrix } A. \]
Here $\overline{A}$ denotes the result of replacing each entry $a_{ij}$ of $A$ by its complex
conjugate $\overline{a_{ij}}$.
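As a small illustration of the Hermitian inner product (*) and of how it appears in the product of a complex matrix with its conjugate transpose, here is a sketch in plain Python. The function names `hermitian_inner` and `times_conjugate_transpose` and the sample vectors and matrix are assumptions made for this example only.

```python
def hermitian_inner(u, v):
    """Inner product (*): the sum of u_k times the complex conjugate of v_k."""
    return sum(x * y.conjugate() for x, y in zip(u, v))

def times_conjugate_transpose(a):
    """Product of A with its conjugate transpose, computed row against row:
    the (i, j) entry is the Hermitian inner product of row i with row j."""
    return [[hermitian_inner(row_i, row_j) for row_j in a] for row_i in a]

v = [3 + 4j, 1 - 2j]
u = [1j, 2 + 0j]

# v . v is real and equals |3+4i|^2 + |1-2i|^2 = 25 + 5 = 30, which is the
# squared length of (3, 4, 1, -2) in R^4.
squared_length = hermitian_inner(v, v)

# Swapping the arguments conjugates the result.
uv, vu = hermitian_inner(u, v), hermitian_inner(v, u)

# A sample matrix whose rows are orthonormal for (*): (1/sqrt 2) [[1, i], [i, 1]].
s = 2 ** -0.5
a = [[s + 0j, s * 1j], [s * 1j, s + 0j]]
product = times_conjugate_transpose(a)

print(squared_length, uv, vu)
print(product)
```

For the sample matrix the product is the identity up to rounding, so its rows form an orthonormal basis of $\mathbb{C}^2$ in the Hermitian sense.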

With this adjustment the arguments of Section 3.1 go through, and one 
obtains the following theorem. 


Criterion for preserving the inner product on $\mathbb{C}^n$. A linear
transformation of $\mathbb{C}^n$ preserves the inner product (*) if and only if its matrix $A$ satisfies
$A\overline{A}^T = 1$, where 1 is the identity matrix.


As in Section 3.1, one finds that the rows (or columns) of $A$ form an
orthonormal basis of $\mathbb{C}^n$. The rows $v_i$ are “normal” in the sense that $|v_i| = 1$,
and “orthogonal” in the sense that $v_i \cdot v_j = 0$ when $i \neq j$, where the dot
denotes the inner product (*).

It is clear that if linear transformations preserve the inner product (*) 
then their product and inverses also preserve (*), so the set of all transfor- 
mations preserving (*) is a group. This group is called the unitary group 
U(n). The determinant of an $A$ in U(n) has absolute value 1 because
\[ A\overline{A}^T = 1 \implies 1 = \det(A\overline{A}^T) = \det(A)\det(\overline{A}^T) = \det(A)\overline{\det(A)} = |\det(A)|^2, \]
and it is easy to see that det(A) can be any number with absolute value 1.
The subgroup of U(n) whose members have determinant 1 is called the
special unitary group SU(n).
We have already met one SU(n), because the group of unit quaternions
\[ \begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}, \quad \text{where } \alpha, \beta \in \mathbb{C} \text{ and } |\alpha|^2 + |\beta|^2 = 1, \]
is none other than SU(2). The rows $(\alpha, -\beta)$ and $(\overline{\beta}, \overline{\alpha})$ are easily seen
to form an orthonormal basis of $\mathbb{C}^2$. Conversely, $(\alpha, -\beta)$ is an arbitrary
unit vector in $\mathbb{C}^2$, and $(\overline{\beta}, \overline{\alpha})$ is the unique unit vector orthogonal to it that
makes the determinant equal to 1.


Path-connectedness of SU(n)


We can prove that SU(n) is path-connected, along similar lines to the proof 
for SO(n) in the previous section. The proof is again by induction on n, 
but the case n = 2 now demands a little more thought. It is helpful to use 
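the complex exponential function $e^{ix}$, which we take to equal $\cos x + i\sin x$
by definition for now. (In Chapter 4 we study exponentiation in depth.)

Given $\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$ in SU(2), first note that $(\alpha, \beta)$ is a unit vector in $\mathbb{C}^2$,
so $\alpha = u\cos\theta$ and $\beta = v\sin\theta$ for some $u, v$ in $\mathbb{C}$ with $|u| = |v| = 1$. This
means that $u = e^{i\varphi}$ and $v = e^{i\psi}$ for some $\varphi, \psi \in \mathbb{R}$.

It follows that
\[ \alpha(t) = e^{i\varphi t}\cos\theta t, \quad \beta(t) = e^{i\psi t}\sin\theta t, \quad \text{for } 0 \le t \le 1, \]
gives a continuous path $\begin{pmatrix} \alpha(t) & -\beta(t) \\ \overline{\beta(t)} & \overline{\alpha(t)} \end{pmatrix}$ from 1 to $\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$ in SU(2). Thus
SU(2) is path-connected.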
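The path just described can be sampled numerically to confirm that it stays inside SU(2) and has the right endpoints. The sketch below is illustrative only: the function names (`su2_matrix`, `path_point`, `is_unitary2`), the sample parameters for the angle and the two phases, and the tolerance are assumptions introduced here.

```python
import cmath
import math

def su2_matrix(alpha, beta):
    """The matrix with rows (alpha, -beta) and (conj(beta), conj(alpha))."""
    return [[alpha, -beta], [beta.conjugate(), alpha.conjugate()]]

def path_point(t, theta, phi, psi):
    """Point at time t on the path: alpha(t) = e^{i phi t} cos(theta t),
    beta(t) = e^{i psi t} sin(theta t)."""
    alpha = cmath.exp(1j * phi * t) * math.cos(theta * t)
    beta = cmath.exp(1j * psi * t) * math.sin(theta * t)
    return su2_matrix(alpha, beta)

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_unitary2(m, tol=1e-12):
    """Rows are orthonormal for the Hermitian inner product."""
    for i in range(2):
        for j in range(2):
            inner = sum(m[i][k] * m[j][k].conjugate() for k in range(2))
            if abs(inner - (1 if i == j else 0)) > tol:
                return False
    return True

theta, phi, psi = 0.9, 2.0, -1.3  # arbitrary sample parameters
samples = [path_point(k / 10, theta, phi, psi) for k in range(11)]

stays_in_su2 = all(is_unitary2(m) and abs(det2(m) - 1) < 1e-12 for m in samples)
target = su2_matrix(cmath.exp(1j * phi) * math.cos(theta),
                    cmath.exp(1j * psi) * math.sin(theta))
print(stays_in_su2, samples[0], samples[-1] == target)
```

Every sampled matrix has orthonormal rows and determinant 1 because $|\alpha(t)|^2 + |\beta(t)|^2 = \cos^2\theta t + \sin^2\theta t = 1$ for each $t$.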


Exercises 


Actually, SU(2) is not the only special unitary group we have already met, though 
the other one is less interesting. 


3.3.1 What is SU(1)?


The following exercises verify that a linear transformation of $\mathbb{C}^n$, with matrix
$A$, preserves the Hermitian inner product (*) if and only if $A\overline{A}^T = 1$. They can be
proved by imitating the corresponding steps of the proof in Section 3.1.


3.3.2 Show that vectors form an orthonormal basis of $\mathbb{C}^n$ if and only if their
conjugates form an orthonormal basis, where the conjugate of a vector
$(u_1, u_2, \ldots, u_n)$ is the vector $(\overline{u_1}, \overline{u_2}, \ldots, \overline{u_n})$.


3.3.3 Show that $A\overline{A}^T = 1$ if and only if the row vectors of $A$ form an orthonormal
basis of $\mathbb{C}^n$.


3.3.4 Deduce from Exercises 3.3.2 and 3.3.3 that the column vectors of A form 
an orthonormal basis. 


3.3.5 Show that if A preserves the inner product (*) then the columns of A form 
an orthonormal basis. 


3.3.6 Show, conversely, that if the columns of A form an orthonormal basis, then
A preserves the inner product (*).


3.4 The symplectic groups


On the space $\mathbb{H}^n$ of ordered n-tuples of quaternions there is a natural inner
product,
\[ (p_1, p_2, \ldots, p_n) \cdot (q_1, q_2, \ldots, q_n) = p_1\overline{q_1} + p_2\overline{q_2} + \cdots + p_n\overline{q_n}. \tag{**} \]


This of course is formally the same as the inner product (*) on $\mathbb{C}^n$,
except that the $p_i$ and $q_i$ now denote arbitrary quaternions. The space $\mathbb{H}^n$
is not a vector space over $\mathbb{H}$, because the quaternions do not act correctly
as “scalars”: multiplying a vector on the left by a quaternion is in general
different from multiplying it on the right, because of the noncommutative
nature of the quaternion product.

Nevertheless, quaternion matrices make sense (thanks to the
associativity of the quaternion product, we still get an associative matrix product),
and we can use them to define linear transformations of $\mathbb{H}^n$. Then, by
specializing to the transformations that preserve the inner product (**), we get
an analogue of the orthogonal group for $\mathbb{H}^n$ called the symplectic group
Sp(n). As with the unitary groups, preserving the inner product implies
preserving length in the corresponding real space, in this case in the space
$\mathbb{R}^{4n}$ corresponding to $\mathbb{H}^n$.

For example, Sp(1) consists of the 1 × 1 quaternion matrices,
multiplication by which preserves length in $\mathbb{H} = \mathbb{R}^4$. In other words, the members
of Sp(1) are simply the unit quaternions. Because we defined quaternions
in Section 1.3 as the 2 × 2 complex matrices
\[ \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix}, \]
it follows that
\[ \mathrm{Sp}(1) = \left\{ \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix} : a^2 + b^2 + c^2 + d^2 = 1 \right\} = \mathrm{SU}(2). \]

Thus we have already met the first symplectic group.

The quaternion matrices $A$ in Sp(n), like the complex matrices in
SU(n), are characterized by the condition $A\overline{A}^T = 1$, where the bar now
denotes the quaternion conjugate. The proof is the same as for SU(n).
Because of this formal similarity, there is a proof that Sp(n) is
path-connected, similar to that for SU(n) given in the previous section.

However, we avoid imposing the condition det(A) = 1, because there
are difficulties in the very definition of determinant for quaternion matrices.
We sidestep this problem by interpreting all n × n quaternion matrices as
2n × 2n complex matrices.


The complex form of Sp(n)


In Section 1.3 we defined quaternions as the complex 2 × 2 matrices
\[ q = \begin{pmatrix} a + id & -b - ic \\ b - ic & a - id \end{pmatrix} = \begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix} \quad \text{for } \alpha, \beta \in \mathbb{C}. \]
Thus the entries of a quaternion matrix are themselves 2 × 2 matrices $q$.
Thanks to a nice feature of the matrix product (that it admits block
multiplication) we can omit the parentheses of each matrix $q$. Then it is natural
to define the complex form, $C(A)$, of a quaternion matrix $A$ to be the result
of replacing each quaternion entry $q$ in $A$ by the 2 × 2 block
\[ \begin{matrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{matrix} \]

Notice also that the transposed complex conjugate of this block
corresponds to the quaternion conjugate of $q$:
\[ \overline{q} = \begin{pmatrix} a - id & b + ic \\ -b + ic & a + id \end{pmatrix} = \begin{pmatrix} \overline{\alpha} & \beta \\ -\overline{\beta} & \alpha \end{pmatrix}. \]

Therefore, if $A$ is a quaternion matrix such that $A\overline{A}^T = 1$, it follows by
block multiplication (and writing 1 for any identity matrix) that
\[ C(A)\,\overline{C(A)}^T = C(A\overline{A}^T) = C(1) = 1. \]
Thus $C(A)$ is a unitary matrix.

Conversely, if $A$ is a quaternion matrix for which $C(A)$ is unitary, then
$A\overline{A}^T = 1$. This follows by viewing the product $A\overline{A}^T$ of quaternion matrices
as the product $C(A)\,\overline{C(A)}^T$ of complex matrices. Therefore, the group Sp(n)
consists of those n × n quaternion matrices $A$ for which $C(A)$ is unitary.

It follows, if we define the complex form of Sp(n) to be the group of 
matrices C(A) for A € Sp(n), that the complex form of Sp(n) consists of the 
unitary matrices of the form C(A), where A is an n x n quaternion matrix. 
In particular, the complex form of Sp(n) is a subgroup of U(2n).

Many books on Lie theory avoid the use of quaternions, and define
Sp(n) as the group of unitary matrices of the form C(A). This gets around
the inconvenience that $\mathbb{H}^n$ is not quite a vector space over $\mathbb{H}$ (mentioned
above) but it breaks the simple thread joining the orthogonal, unitary, and
symplectic groups: they are the “generalized rotation” groups of the spaces
with coordinates from $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{H}$, respectively.
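To make the complex form concrete, the following sketch builds $C(A)$ for a sample 2 × 2 quaternion matrix and checks that it is unitary. It is an illustration under stated assumptions, not the author's construction: quaternion entries are represented as tuples of four real parameters (a, b, c, d) matching the 2 × 2 block with entries a + id, −b − ic, b − ic, a − id, and the names `block`, `complex_form`, and `conj_transpose` are hypothetical helpers.

```python
def block(a, b, c, d):
    """2 x 2 complex block with rows (a + id, -b - ic) and (b - ic, a - id)."""
    return [[complex(a, d), complex(-b, -c)],
            [complex(b, -c), complex(a, -d)]]

def complex_form(quat_matrix):
    """Replace each entry (a, b, c, d) of an n x n quaternion matrix by its
    2 x 2 block, giving a 2n x 2n complex matrix."""
    n = len(quat_matrix)
    out = [[0j] * (2 * n) for _ in range(2 * n)]
    for r in range(n):
        for s in range(n):
            blk = block(*quat_matrix[r][s])
            for i in range(2):
                for j in range(2):
                    out[2 * r + i][2 * s + j] = blk[i][j]
    return out

def conj_transpose(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1 if i == j else 0)) <= tol
               for i in range(len(m)) for j in range(len(m)))

# Sample quaternion matrix (1/sqrt 2) [[1, u], [u, 1]], where u is the unit
# quaternion with parameters (0, 0, 1, 0). Since u is pure imaginary, the two
# rows are orthogonal for the inner product (**), and each has length 1.
s = 2 ** -0.5
one, u = (s, 0.0, 0.0, 0.0), (0.0, 0.0, s, 0.0)
a = [[one, u], [u, one]]

c = complex_form(a)
print(is_identity(matmul(c, conj_transpose(c))))
```

The resulting 4 × 4 complex matrix is unitary, in line with the statement that the complex form of Sp(n) lies inside U(2n).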


Exercises 


It is easy to test whether a matrix consists of blocks of the form
\[ \begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}. \]
Nevertheless, it is sometimes convenient to describe the property of “being of the
form C(A)” more algebraically. One way to do this is with the help of the special
matrix
\[ J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]


3.4.1 If $B = \begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$ show that $JBJ^{-1} = \overline{B}$.


3.4.2 Conversely, show that if $JBJ^{-1} = \overline{B}$ and $B = \begin{pmatrix} c & d \\ e & f \end{pmatrix}$ then we have $c = \overline{f}$
and $d = -\overline{e}$, so $B$ has the form $\begin{pmatrix} \alpha & -\beta \\ \overline{\beta} & \overline{\alpha} \end{pmatrix}$.


Now suppose that $B_{2n}$ is any 2n × 2n complex matrix, and let
\[ J_{2n} = \begin{pmatrix} J & 0 & \cdots & 0 \\ 0 & J & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J \end{pmatrix}, \quad \text{where 0 is the 2 × 2 zero matrix.} \]


3.4.3 Use block multiplication, and the results of Exercises 3.4.1 and 3.4.2, to
show that $B_{2n}$ has the form $C(A)$ if and only if $J_{2n}B_{2n}J_{2n}^{-1} = \overline{B_{2n}}$.


The equation satisfied by $J_{2n}$ and $B_{2n}$ enables us to derive information about $\det(B_{2n})$
(thus getting around the problem with the determinant of a quaternion matrix).


3.4.4 By taking det of both sides of the equation in Exercise 3.4.3, show that
$\det(B_{2n})$ is real.


3.4.5 Assuming now that $B_{2n}$ is in the complex form of Sp(n), and hence is
unitary, show that $\det(B_{2n}) = \pm 1$.


One can prove Sp(n) is path-connected by an argument like that used for
SU(n) in the previous section. First prove path-connectedness of Sp(1) as for
SU(2), using a result from Section 4.2 that each unit quaternion is the exponential
of a pure imaginary quaternion.


3.4.6 Deduce from the path-connectedness of Sp(n) that $\det(B_{2n}) = 1$.


This is why there is no “special symplectic group”: the matrices in the symplectic
group already have determinant 1, under a sensible interpretation of determinant.

3.5 Maximal tori and centers 


The main key to understanding the structure of a Lie group G is its maximal
torus, a (not generally unique) maximal subgroup isomorphic to
\[ \mathbb{T}^k = \mathbb{S}^1 \times \mathbb{S}^1 \times \cdots \times \mathbb{S}^1 \quad (k\text{-fold Cartesian product}) \]
contained in G. The group $\mathbb{T}^k$ is called a torus because it generalizes the
ordinary torus $\mathbb{T}^2 = \mathbb{S}^1 \times \mathbb{S}^1$. An obvious example is the group SO(2) = $\mathbb{S}^1$,
which is its own maximal torus. For the other groups SO(n), not to mention
SU(n) and Sp(n), maximal tori are not so obvious, though we will find
them by elementary means in the next section. To illustrate the kind of
argument involved we first look at the case of SO(3).


Maximal torus of SO(3) 


If we view SO(3) as the rotation group of $\mathbb{R}^3$, and let $e_1$, $e_2$, and $e_3$ be the
standard basis vectors, then the matrices
\[ R'_\theta = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
form an obvious $\mathbb{T}^1 = \mathbb{S}^1$ in SO(3). The matrices $R'_\theta$ are simply rotations
of the $(e_1, e_2)$-plane through angle $\theta$, which leave the $e_3$-axis fixed.

If $\mathbb{T}$ is any torus in G = SO(3) that contains this $\mathbb{T}^1$ then, since any torus is
abelian, any $A \in \mathbb{T}$ commutes with all $R'_\theta \in \mathbb{T}^1$. We will show that if
\[ AR'_\theta = R'_\theta A \quad \text{for all } R'_\theta \in \mathbb{T}^1 \tag{*} \]
then $A \in \mathbb{T}^1$, so $\mathbb{T} = \mathbb{T}^1$ and hence $\mathbb{T}^1$ is maximal. It suffices to show that
\[ A(e_1), A(e_2) \in (e_1, e_2)\text{-plane}, \]
because in that case $A$ is an isometry of the $(e_1, e_2)$-plane that fixes O. The
only such isometries are rotations and reflections, and the only ones that
commute with all rotations are rotations themselves.

So, suppose that
\[ A(e_1) = a_1e_1 + a_2e_2 + a_3e_3. \]
By the hypothesis (*), $A$ commutes with all $R'_\theta$, and in particular with
\[ R'_\pi = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
Now we have
\begin{align*}
AR'_\pi(e_1) &= A(-e_1) = -a_1e_1 - a_2e_2 - a_3e_3, \\
R'_\pi A(e_1) &= R'_\pi(a_1e_1 + a_2e_2 + a_3e_3) = -a_1e_1 - a_2e_2 + a_3e_3,
\end{align*}
so it follows from $AR'_\pi = R'_\pi A$ that $a_3 = 0$ and hence
\[ A(e_1) \in (e_1, e_2)\text{-plane}. \]
A similar argument shows that
\[ A(e_2) \in (e_1, e_2)\text{-plane}, \]
which completes the proof that $\mathbb{T}^1$ is maximal in SO(3).
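The key step of this argument, that a rotation moving $e_1$ out of the $(e_1, e_2)$-plane cannot commute with the half-turn about the $e_3$-axis, can be seen in a small numerical experiment. The sketch below uses hypothetical helper names (`rot_about_e3`, `rot_about_e2`, `commutator_size`) and arbitrary sample angles; it illustrates the argument and is not a proof.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rot_about_e3(theta):
    """Rotation of the (e1, e2)-plane through theta, fixing the e3-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_about_e2(theta):
    """Rotation of the (e1, e3)-plane through theta, fixing the e2-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def commutator_size(a, b):
    """Largest absolute entry of AB - BA (zero exactly when A and B commute)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))

half_turn = rot_about_e3(math.pi)

# This rotation sends e1 to a vector with nonzero e3-component (a3 != 0) ...
tilting = rot_about_e2(0.5)
a3 = tilting[2][0]
# ... so it fails to commute with the half-turn,
size_tilting = commutator_size(tilting, half_turn)
# whereas a member of the torus itself does commute with it.
size_torus = commutator_size(rot_about_e3(0.9), half_turn)

print(a3, size_tilting, size_torus)
```

The first commutator is visibly nonzero and the second vanishes up to rounding, matching the conclusion that commuting with the half-turn forces $a_3 = 0$.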


An important substructure of G revealed by the maximal torus is the 
center of G, a subgroup defined by 


\[ Z(G) = \{A \in G : AB = BA \text{ for all } B \in G\}. \]


(The letter Z stands for “Zentrum,” the German word for “center.”) It is 
easy to check that Z(G) is closed under products and inverses, and hence 
Z(G) is a group. We can illustrate how the maximal torus reveals the center 
with the example of SO(3) again. 


Center of SO(3) 


An element $A \in Z(\mathrm{SO}(3))$ commutes with all elements of SO(3), and in
particular with all elements of the maximal torus $\mathbb{T}^1$. The argument above
then shows that $A$ fixes the basis vector $e_3$. Interchanging basis vectors, we
likewise find that $A$ fixes $e_1$ and $e_2$. Hence $A$ is the identity rotation 1.
Thus Z(SO(3)) = {1}.


Exercises


The 2-to-1 map from SU(2) to SO(3) ensures that the maximal torus and center 
of SU(2) are similar to those of SO(3). 


3.5.1 Give an example of a $\mathbb{T}^1$ in SU(2).


3.5.2 Explain why a $\mathbb{T}^2$ in SU(2) yields a $\mathbb{T}^2$ in SO(3), so $\mathbb{T}^1$ is maximal in
SU(2). (Hint: Map each element g of the $\mathbb{T}^2$ in SU(2) to the pair ±g in
SO(3), and look at the images of the $\mathbb{S}^1$ factors of $\mathbb{T}^2$.)


3.5.3 Explain why Z(SU(2)) = {±1}.


The center of SO(3) can also be found by a direct geometric argument.


3.5.4 Suppose that A is a rotation of $\mathbb{R}^3$, about the $e_1$-axis, say, that is not the
identity and not a half-turn. Explain (preferably with pictures) why A does
not commute with the half-turn about the $e_3$-axis.


3.5.5 If A is a half-turn of $\mathbb{R}^3$ about the $e_1$-axis, find a rotation that does not
commute with A.


In Section 3.7 we will show that Z(SO(2m + 1)) = {1} for all m. However, 
the situation is different for SO(2m). 


3.5.6 Give an example of a nonidentity element of Z(SO(2m)) for each m ≥ 2.


3.6 Maximal tori in SO(n), U(n), SU(n), Sp(n) 


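The one-dimensional torus $\mathbb{T}^1 = \mathbb{S}^1$ appears as a matrix group in several
different guises:


• as a group of 2 × 2 real matrices
\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \]

• as a group of complex numbers (or 1 × 1 complex matrices)
\[ z_\theta = \cos\theta + i\sin\theta, \]

• as a group of quaternions (or 1 × 1 quaternion matrices)
\[ q_\theta = \cos\theta + \mathbf{i}\sin\theta. \]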


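Each of these incarnations of $\mathbb{T}^1$ gives rise to a different incarnation of $\mathbb{T}^k$:

• as a group of 2k × 2k real matrices
\[ R_{\theta_1,\theta_2,\ldots,\theta_k} = \begin{pmatrix}
\cos\theta_1 & -\sin\theta_1 & & & & & \\
\sin\theta_1 & \cos\theta_1 & & & & & \\
 & & \cos\theta_2 & -\sin\theta_2 & & & \\
 & & \sin\theta_2 & \cos\theta_2 & & & \\
 & & & & \ddots & & \\
 & & & & & \cos\theta_k & -\sin\theta_k \\
 & & & & & \sin\theta_k & \cos\theta_k
\end{pmatrix}, \]
where all the blank entries are zero,

• as a group of k × k unitary matrices
\[ Z_{\theta_1,\theta_2,\ldots,\theta_k} = \begin{pmatrix} e^{i\theta_1} & & & \\ & e^{i\theta_2} & & \\ & & \ddots & \\ & & & e^{i\theta_k} \end{pmatrix}, \]
where all the blank entries are zero and $e^{i\theta} = \cos\theta + i\sin\theta$,

• as a group of k × k symplectic matrices
\[ Q_{\theta_1,\theta_2,\ldots,\theta_k} = \begin{pmatrix} e^{\mathbf{i}\theta_1} & & & \\ & e^{\mathbf{i}\theta_2} & & \\ & & \ddots & \\ & & & e^{\mathbf{i}\theta_k} \end{pmatrix}, \]
where all the blank entries are zero and $e^{\mathbf{i}\theta} = \cos\theta + \mathbf{i}\sin\theta$. (This
generalization of the exponential function is justified in the next
chapter. In the meantime, $e^{\mathbf{i}\theta}$ may be taken as an abbreviation for
$\cos\theta + \mathbf{i}\sin\theta$.)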


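We can also represent $\mathbb{T}^k$ by larger matrices obtained by “padding” the
above matrices with an extra row and column, both consisting of zeros
except for a 1 at the bottom right-hand corner (as we did to produce the
matrices $R'_\theta$ in SO(3) in the previous section). Using this idea, we find the
following tori in the groups SO(2m), SO(2m + 1), U(n), SU(n), and Sp(n).

In SO(2m) we have the $\mathbb{T}^m$ consisting of the matrices $R_{\theta_1,\theta_2,\ldots,\theta_m}$. In
SO(2m + 1) we have the $\mathbb{T}^m$ consisting of the “padded” matrices
\[ R'_{\theta_1,\theta_2,\ldots,\theta_m} = \begin{pmatrix}
\cos\theta_1 & -\sin\theta_1 & & & & & & \\
\sin\theta_1 & \cos\theta_1 & & & & & & \\
 & & \cos\theta_2 & -\sin\theta_2 & & & & \\
 & & \sin\theta_2 & \cos\theta_2 & & & & \\
 & & & & \ddots & & & \\
 & & & & & \cos\theta_m & -\sin\theta_m & \\
 & & & & & \sin\theta_m & \cos\theta_m & \\
 & & & & & & & 1
\end{pmatrix}. \]

In U(n) we have the $\mathbb{T}^n$ consisting of the matrices $Z_{\theta_1,\theta_2,\ldots,\theta_n}$. In SU(n) we
have the $\mathbb{T}^{n-1}$ consisting of the $Z_{\theta_1,\theta_2,\ldots,\theta_n}$ with $\theta_1 + \theta_2 + \cdots + \theta_n = 0$. The
latter matrices form a $\mathbb{T}^{n-1}$ because
\[ Z_{\theta_1,\theta_2,\ldots,\theta_n} = e^{i\theta_n} \begin{pmatrix} e^{i(\theta_1 - \theta_n)} & & & \\ & \ddots & & \\ & & e^{i(\theta_{n-1} - \theta_n)} & \\ & & & 1 \end{pmatrix} \]
and the matrices on the right clearly form a $\mathbb{T}^{n-1}$. Finally, in Sp(n) we
have the $\mathbb{T}^n$ consisting of the matrices $Q_{\theta_1,\theta_2,\ldots,\theta_n}$.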
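The description of the torus in SU(n) can be checked on a sample element. In the sketch below, a diagonal matrix is represented simply by the list of its diagonal entries; the names `torus_diagonal` and `product` and the sample angles are assumptions made for this illustration.

```python
import cmath

def torus_diagonal(thetas):
    """Diagonal entries e^{i theta_1}, ..., e^{i theta_n} of the diagonal
    torus element with these angles."""
    return [cmath.exp(1j * t) for t in thetas]

def product(values):
    """Product of a list of complex numbers (the determinant of a diagonal matrix)."""
    result = 1 + 0j
    for v in values:
        result *= v
    return result

thetas = [0.3, 1.1, -1.4]  # sample angles with sum zero (up to rounding)
z = torus_diagonal(thetas)

# With theta_1 + ... + theta_n = 0 the determinant is e^{i * 0} = 1.
determinant = product(z)

# Factorization: pull out e^{i theta_n}; the remaining diagonal matrix has
# entries e^{i(theta_k - theta_n)} and a 1 in the last place.
last = thetas[-1]
inner = torus_diagonal([t - last for t in thetas])
factored = [cmath.exp(1j * last) * w for w in inner]

print(abs(determinant - 1), inner[-1], max(abs(f - w) for f, w in zip(factored, z)))
```

The inner factor has a 1 in its last diagonal place, so it is determined by the n − 1 remaining angles, which is the sense in which these matrices form an (n − 1)-dimensional torus.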

We now show that these “obvious” tori are maximal. As with SO(3),
used as an illustration in the previous section, the proof in each case
considers a matrix $A \in G$ that commutes with each member of the given torus
$\mathbb{T}$, and shows that $A \in \mathbb{T}$.


Maximal tori in generalized rotation groups. The tori listed above are
maximal in the corresponding groups.


Proof. Case (1): $\mathbb{T}^m$ in SO(2m), for m ≥ 2.

If we let $e_1, e_2, \ldots, e_{2m}$ denote the standard basis vectors for $\mathbb{R}^{2m}$, then
the typical member $R_{\theta_1,\theta_2,\ldots,\theta_m}$ of $\mathbb{T}^m$ is the product of the following plane
rotations, each of which fixes the basis vectors orthogonal to the plane:

rotation of the $(e_1, e_2)$-plane through angle $\theta_1$,
rotation of the $(e_3, e_4)$-plane through angle $\theta_2$,
⋮
rotation of the $(e_{2m-1}, e_{2m})$-plane through angle $\theta_m$.

Now suppose that $A \in$ SO(2m) commutes with each $R_{\theta_1,\theta_2,\ldots,\theta_m}$. We are
going to show that
\begin{align*}
A(e_1), A(e_2) &\in (e_1, e_2)\text{-plane}, \\
A(e_3), A(e_4) &\in (e_3, e_4)\text{-plane}, \\
&\;\;\vdots \\
A(e_{2m-1}), A(e_{2m}) &\in (e_{2m-1}, e_{2m})\text{-plane},
\end{align*}
from which it follows that $A$ is a product of rotations of these planes, and
hence is a member of $\mathbb{T}^m$. (The possibility that $A$ reflects some plane $\mathcal{P}$ is
ruled out by the fact that $A$ commutes with all members of $\mathbb{T}^m$, including
those that rotate only the plane $\mathcal{P}$. Then it follows as in the case of SO(3)
that $A$ rotates $\mathcal{P}$.)

To show that $A$ maps the basis vectors into the planes claimed, it
suffices to show that $A(e_1) \in (e_1, e_2)$-plane, since the other cases are similar.
So, suppose that
\[ AR_{\theta_1,\theta_2,\ldots,\theta_m} = R_{\theta_1,\theta_2,\ldots,\theta_m}A \quad \text{for all } R_{\theta_1,\theta_2,\ldots,\theta_m} \in \mathbb{T}^m, \]
and in particular that
\[ AR_{\pi,0,\ldots,0}(e_1) = R_{\pi,0,\ldots,0}A(e_1). \]
Then if $A(e_1) = a_1e_1 + a_2e_2 + \cdots + a_{2m}e_{2m}$, we have
\[ AR_{\pi,0,\ldots,0}(e_1) = A(-e_1) = -a_1e_1 - a_2e_2 - a_3e_3 - \cdots - a_{2m}e_{2m}, \]
but
\[ R_{\pi,0,\ldots,0}A(e_1) = -a_1e_1 - a_2e_2 + a_3e_3 + \cdots + a_{2m}e_{2m}, \]
whence $a_3 = a_4 = \cdots = a_{2m} = 0$, as required.

The argument is similar for any other $e_k$. Hence $A \in \mathbb{T}^m$, as claimed.

Case (2): $\mathbb{T}^m$ in SO(2m + 1).

In this case we generalize the argument for SO(3) from the previous
section, using maps such as $R'_{\pi,0,\ldots,0}$ in place of $R'_\pi$.

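Case (3): $\mathbb{T}^n$ in U(n).

Let $e_1, e_2, \ldots, e_n$ be the standard basis vectors of $\mathbb{C}^n$, and suppose that
$A$ commutes with each element $Z_{\theta_1,\theta_2,\ldots,\theta_n}$ of $\mathbb{T}^n$. In particular, $A$ commutes
with
\[ Z_{\pi,0,\ldots,0} = \begin{pmatrix} e^{i\pi} & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{pmatrix}. \]
Then if $A(e_1) = a_1e_1 + \cdots + a_ne_n$ we have
\begin{align*}
AZ_{\pi,0,\ldots,0}(e_1) &= A(-e_1) = -a_1e_1 - a_2e_2 - \cdots - a_ne_n, \\
Z_{\pi,0,\ldots,0}A(e_1) &= Z_{\pi,0,\ldots,0}(a_1e_1 + \cdots + a_ne_n) = -a_1e_1 + a_2e_2 + \cdots + a_ne_n,
\end{align*}
whence it follows that $a_2 = \cdots = a_n = 0$.

Thus $A(e_1) = c_1e_1$ for some $c_1 \in \mathbb{C}$, and a similar argument shows that
$A(e_k) = c_ke_k$ for each $k$. Also, $A(e_1), \ldots, A(e_n)$ are an orthonormal basis,
since $A \in$ U(n). Hence each $|c_k| = 1$, so $c_k = e^{i\varphi_k}$ and therefore $A \in \mathbb{T}^n$.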

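Case (4): $\mathbb{T}^{n-1}$ in SU(n).

For n > 2 we can argue as for U(n), except that we need to commute
$A$ with both $Z_{\pi,\pi,0,\ldots,0}$ and $Z_{\pi,0,\pi,0,\ldots,0}$ to conclude that $A(e_1) = c_1e_1$. This is
because $Z_{\pi,0,0,\ldots,0}$ is not in SU(n), since it has determinant −1.

For n = 2 we can argue as follows. Suppose $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ commutes with
each $Z_{\theta,-\theta} \in \mathbb{T}^1$. In particular, $A$ commutes with
\[ Z_{\pi/2,-\pi/2} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \]
which implies that
\[ \begin{pmatrix} ai & -bi \\ ci & -di \end{pmatrix} = \begin{pmatrix} ai & bi \\ -ci & -di \end{pmatrix}. \]
It follows that b = c = 0 and hence $A \in \mathbb{T}^1$.

Case (5): $\mathbb{T}^n$ in Sp(n).

Here we can argue exactly as in Case (3).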


Exercises


3.6.1 Viewing $\mathbb{C}^n$ as $\mathbb{R}^{2n}$, show that $Z_{\theta_1,\theta_2,\ldots,\theta_n}$ is the same isometry as $R_{\theta_1,\theta_2,\ldots,\theta_n}$.


3.6.2 Use Exercise 3.6.1 to give another proof that $\mathbb{T}^n$ is a maximal torus of U(n).


3.6.3 Show that the maximal tori found above are in fact maximal abelian
subgroups of SO(n), U(n), SU(n), Sp(n).


We did not look for a maximal torus in O(n) because the subgroup SO(n) is of
more interest to us, but in any case it is easy to find a maximal torus in O(n).


3.6.4 Explain why a maximal torus of O(n) is also a maximal torus of SO(n).


3.7 Centers of SO(n), U(n), SU(n), Sp(n)


The arguments in the previous section show that an element $A$ in G =
SO(n), U(n), SU(n), Sp(n) that commutes with all elements of a maximal
torus $\mathbb{T}$ in G is in fact in $\mathbb{T}$. It follows that if $A$ commutes with all elements
of G then $A \in \mathbb{T}$. Thus we can assume that elements $A$ of the center Z(G)
of G have the special form known for members of $\mathbb{T}$. This enables us to
identify Z(G) fairly easily when G = SO(n), U(n), SU(n), Sp(n).


Centers of generalized rotation groups. The centers of these groups are:


(1) Z(SO(2m)) = {±1}.

(2) Z(SO(2m + 1)) = {1}.

(3) Z(U(n)) = {ω1 : |ω| = 1}.

(4) Z(SU(n)) = {ω1 : $\omega^n$ = 1}.

(5) Z(Sp(n)) = {±1}.
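Two ingredients behind this list can be spot-checked numerically: a 2 × 2 rotation block commutes with the diagonal matrix with entries 1 and −1 only when its sine vanishes, and a scalar matrix ω1 of size n has determinant $\omega^n$, which equals 1 exactly for the n-th roots of unity. The sketch below is illustrative only; the names `rotation_block`, `MIRROR`, `commute`, and `nth_roots_of_unity` and the tolerance are assumptions introduced here.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def commute(a, b, tol=1e-9):
    """True when AB and BA agree entrywise up to tol."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return all(abs(ab[i][j] - ba[i][j]) <= tol for i in range(n) for j in range(n))

def rotation_block(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Diagonal matrix with entries 1 and -1 (hypothetical name for this sketch).
MIRROR = [[1.0, 0.0], [0.0, -1.0]]

generic_commutes = commute(rotation_block(0.3), MIRROR)        # sin != 0
half_turn_commutes = commute(rotation_block(math.pi), MIRROR)  # block is -1
identity_commutes = commute(rotation_block(0.0), MIRROR)       # block is +1

def nth_roots_of_unity(n):
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

# det(omega * 1) = omega^n for the n x n scalar matrix omega * 1.
n = 5
dets = [w ** n for w in nth_roots_of_unity(n)]

# The scalar matrix -1 of odd size 2m + 1 has determinant (-1)^(2m+1) = -1.
m = 2
det_minus_identity_odd = (-1) ** (2 * m + 1)

print(generic_commutes, half_turn_commutes, identity_commutes)
print(max(abs(d - 1) for d in dets), det_minus_identity_odd)
```

The last line reflects why −1 belongs to SO(2m) but not to SO(2m + 1).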


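Proof. Case (1): $A \in Z(\mathrm{SO}(2m))$ for m ≥ 2.

In this case $A = R_{\theta_1,\theta_2,\ldots,\theta_m}$ for some angles $\theta_1, \theta_2, \ldots, \theta_m$, and $A$ commutes
with all members of SO(2m). Thus $R_{\theta_1,\theta_2,\ldots,\theta_m}$ is built from a
sequence of 2 × 2 blocks (placed along the diagonal) of the form
\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

We notice that $R_\theta$ does not commute with the matrix
\[ I^* = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
unless $\sin\theta = 0$ and hence $\cos\theta = \pm 1$. Therefore, if we build a matrix
$I^*_{2m} \in$ SO(2m) with copies of $I^*$ on the diagonal, $R_{\theta_1,\theta_2,\ldots,\theta_m}$ will commute
with $I^*_{2m}$ only if each $\sin\theta_k = 0$ and $\cos\theta_k = \pm 1$.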


Thus a matrix $A$ in Z(SO(2m)) has diagonal entries ±1 and zeros
elsewhere. Moreover, if both +1 and −1 occur we can find a matrix in SO(2m)
that does not commute with $A$; namely, a matrix with $R_\theta$ on the diagonal at
the position of an adjacent +1 and −1 in $A$, and otherwise only 1’s on the
diagonal. So, in fact, $A = 1$ or $A = -1$. Both 1 and −1 belong to SO(2m),
and they obviously commute with everything, so Z(SO(2m)) = {±1}.

Case (2): $A \in Z(\mathrm{SO}(2m + 1))$.

The argument is very similar to that for Case (1), except for the last
step. The (2m + 1) × (2m + 1) matrix −1 does not belong to SO(2m + 1),
because its determinant equals −1. Hence Z(SO(2m + 1)) = {1}.


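Case (3): $A \in Z(\mathrm{U}(n))$.

In this case $A = Z_{\theta_1,\theta_2,\ldots,\theta_n}$ for some $\theta_1, \theta_2, \ldots, \theta_n$, and $A$ commutes with
all elements of U(n). If n = 1 then U(n) is isomorphic to the abelian group
$\mathbb{S}^1 = \{e^{i\theta} : \theta \in \mathbb{R}\}$, so U(1) is its own center. If n ≥ 2 we take advantage
of the fact that
\[ \begin{pmatrix} e^{i\theta_1} & 0 \\ 0 & e^{i\theta_2} \end{pmatrix} \text{ does not commute with } \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \]
unless $e^{i\theta_1} = e^{i\theta_2}$. It follows, by building a matrix with $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ somewhere
on the diagonal and otherwise only 1s on the diagonal, that $A = Z_{\theta_1,\theta_2,\ldots,\theta_n}$
must have $e^{i\theta_1} = e^{i\theta_2} = \cdots = e^{i\theta_n}$.

In other words, elements of Z(U(n)) have the form $e^{i\theta}1$. Conversely,
all matrices of this form are in U(n), and they commute with all other
matrices. Hence
\[ Z(\mathrm{U}(n)) = \{e^{i\theta}1 : \theta \in \mathbb{R}\} = \{\omega 1 : |\omega| = 1\}. \]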


Case (4): $A \in Z(\mathrm{SU}(n))$.

The argument for U(n) shows that $A$ must have the form ω1, where
|ω| = 1. But in SU(n) we must also have
\[ 1 = \det(A) = \omega^n. \]
This means that ω is one of the n “roots of unity”
\[ e^{2i\pi/n}, \quad e^{4i\pi/n}, \quad \ldots, \quad e^{2(n-1)i\pi/n}, \quad 1. \]
All such matrices ω1 clearly belong to SU(n) and commute with
everything, hence Z(SU(n)) = {ω1 : $\omega^n$ = 1}.
Case (5): $A \in Z(\mathrm{Sp}(n))$.

In this case $A = Q_{\theta_1,\theta_2,\ldots,\theta_n}$ for some $\theta_1, \theta_2, \ldots, \theta_n$, and $A$ commutes
with all elements of Sp(n). The argument used for U(n) applies, up to the
point of showing that all matrices in Z(Sp(n)) have the form $q1$, where
$|q| = 1$. But now we must bear in mind that quaternions $q$ do not generally
commute. Indeed, only the real quaternions commute with all the others,
and the only real quaternions $q$ with $|q| = 1$ are $q = 1$ and $q = -1$. Thus
\[ Z(\mathrm{Sp}(n)) = \{\pm 1\}. \]


Exercises


It happens that the quotient of each of the groups SO(n), U(n), SU(n), Sp(n) by 
its center is a group with trivial center (see Exercise 3.8.1). However, it is not 
generally true that the quotient of a group by its center has trivial center. 


3.7.1 Find the center Z(G) of G = {1,—1,i, —i,j, —j,k, —k} and hence show that 
G/Z(G) has nontrivial center. 


3.7.2 Prove that U(n)/Z(U(n)) = SU(n)/Z(SU(n)).
3.7.3 Is SU(2)/Z(SU(2)) = SO(3)? 


3.7.4 Using the relationship between U(n), Z(U(n)), and SU(n), or otherwise, 
show that U(n) is path-connected. 


3.8 Connectedness and discreteness 


Finding the centers of SO(n), U(n), SU(n), and Sp(n) is an important step
towards understanding which of these groups are simple. The center of 
any group G is a normal subgroup of G, hence G cannot be simple unless 
Z(G) = {1}. This rules out all of the groups above except the SO(2m + 1). 
Deciding whether there are any other normal subgroups of SO(2m + 1) 
hinges on the distinction between discrete and nondiscrete subgroups. 

A subgroup H of a matrix Lie group G is called discrete if there is a
positive lower bound to the distance between any two distinct members of H, the
distance between matrices $(a_{ij})$ and $(b_{ij})$ being defined as
\[ \sqrt{\sum_{i,j} |a_{ij} - b_{ij}|^2}. \]

(We say more about the distance between matrices in the next chapter.) In
particular, any finite subgroup of G is discrete, so the centers of SO(n),
SU(n), and Sp(n) are discrete. On the other hand, the center of U(n) is
clearly not discrete, because it includes elements arbitrarily close to the
identity matrix.
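The contrast between a discrete center and a nondiscrete one can be made concrete with a short computation. The sketch below assumes that the distance between two matrices is the square root of the sum of the squared absolute differences of their entries; the function names `distance` and `scalar_matrix` and the sample values are illustrative choices.

```python
import cmath
import math

def distance(a, b):
    """Assumed matrix distance: sqrt of the sum of |a_ij - b_ij|^2."""
    return math.sqrt(sum(abs(x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def scalar_matrix(value, n):
    """The n x n matrix with value on the diagonal and zeros elsewhere."""
    return [[value if i == j else 0 for j in range(n)] for i in range(n)]

# Center of SO(4): the two matrices 1 and -1, a fixed distance apart.
gap_so4 = distance(scalar_matrix(1, 4), scalar_matrix(-1, 4))

# Center of U(3): the matrices e^{i t} 1 approach the identity as t -> 0.
n = 3
identity = scalar_matrix(1, n)
angles = (0.1, 0.01, 0.001)
gaps_u3 = [distance(scalar_matrix(cmath.exp(1j * t), n), identity) for t in angles]

print(gap_so4)
print(gaps_u3)
```

The first distance is a fixed positive number, while the distances in the second list shrink with the angle, so no positive lower bound exists for the center of U(n).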

In finding the centers of SO(n), SU(n), and Sp(n) we have in fact found 
all their discrete normal subgroups, because of the following remarkable 
theorem, due to Schreier [1925]. 


Centrality of discrete normal subgroups. If G is a path-connected matrix
Lie group with a discrete normal subgroup H, then H is contained in the
center Z(G) of G.

Proof. Since H is normal, $BAB^{-1} \in H$ for each $A \in H$ and $B \in G$. Thus
$B \mapsto BAB^{-1}$ defines a continuous map from G into the discrete set H. Since
G is path-connected, and a continuous map sends paths to paths, the image
of the map must be a single point of H. This point is necessarily $A$ because
$1 \mapsto 1A1^{-1} = A$.

In other words, each $A \in H$ has the property that $BA = AB$ for all $B \in G$.
That is, $A \in Z(G)$.


The groups SO(n), SU(n), and Sp(n) are path-connected, as we have
seen in Sections 3.2, 3.3, and 3.4, so all their discrete normal subgroups 
are in their centers, determined in Section 3.7. In particular, SO(2m + 1) 
has no nontrivial discrete normal subgroup, because its center is {1}. 

It follows that the only normal subgroups we may have missed in 
SO(n), SU(n), and Sp(n) are those that are not discrete. In Section 7.5 
we will establish that such subgroups do not exist, so all normal sub- 
groups of SO(n), SU(n), and Sp(n) are in their centers. In particular, 
the groups SO(2m + 1) are all simple, and it follows from Exercise 3.8.1 
below that the rest are simple “modulo their centers.” That is, for G = 
SO(2m),SU(n),Sp(n), the group G/Z(G) is simple. 


Exercises 


3.8.1 If Z(G) is the only nontrivial normal subgroup of G, show that G/Z(G) is 
simple. 


The result of Exercises 3.2.1, 3.2.2, 3.2.3 can be improved, with the help of 
some ideas used above, to show that the identity component is a normal subgroup 
of G. 


3.8.2 Show that, if H is a subgroup of G and $AHA^{-1} \subseteq H$ for each $A \in G$, then H
is a normal subgroup of G.


3.8.3 If G is a matrix group with identity component H, show that $AHA^{-1} \subseteq H$
for each matrix $A \in G$.


The proof of Schreier’s theorem assumes only that there is no path in H be- 
tween two distinct members, that is, H is totally disconnected. Thus we have 
actually proved: if G is a path-connected group with a totally disconnected nor- 
mal subgroup H, then H is contained in Z(G). We can give examples of totally 
disconnected subgroups that are not discrete. 


3.8.4 Show that the subgroup $H = \{\cos 2\pi r + i\sin 2\pi r : r \text{ rational}\}$ of the circle
SO(2) is totally disconnected but dense, that is, each arc of the circle contains
an element of H.

This example is also a normal subgroup. However, normal, dense, totally
disconnected subgroups are rare.


3.8.5 Explain why there is no normal, dense, totally disconnected subgroup of 
SO(n) for n > 2. 


3.9 Discussion 


The idea of treating orthogonal, unitary, and symplectic groups uniformly
as generalized isometry groups of the spaces $\mathbb{R}^n$, $\mathbb{C}^n$, and $\mathbb{H}^n$ seems to
be due to Chevalley [1946]. Before the appearance of Chevalley’s book,
the symplectic group Sp(n) was generally viewed as the group of unitary
transformations of $\mathbb{C}^{2n}$ that preserve the symplectic form

\[ (\alpha_1\beta_1' - \beta_1\alpha_1') + \cdots + (\alpha_n\beta_n' - \beta_n\alpha_n'), \]

where $(\alpha_1, \beta_1, \ldots, \alpha_n, \beta_n)$ is the typical element of $\mathbb{C}^{2n}$. This element
corresponds to the element $(q_1, \ldots, q_n)$ of $\mathbb{H}^n$, where


The invariance of the quaternion inner product 
\[ q_1\overline{q_1'} + \cdots + q_n\overline{q_n'} \]


is therefore equivalent to the invariance of the matrix product 


which turns out to be equivalent to the invariance of the symplectic form. 
The word “symplectic” itself was introduced by Hermann Weyl in his book
The Classical Groups, Weyl [1939], p. 165:


The name “complex group” formerly advocated by me in allusion
to line complexes, as these are defined by the vanishing
of antisymmetric bilinear forms, has become more and more
embarrassing through collision with the word “complex” in
the connotation of complex number. I therefore propose to replace
it with the corresponding Greek adjective “symplectic.”

Maximal tori were also introduced by Weyl, in his paper Weyl [1925].
In this book we use them only to find the centers of the orthogonal, unitary, 
and symplectic groups, since the centers turn out to be crucial in the inves- 
tigation of simplicity. However, maximal tori themselves are important for 
many investigations in the structure of Lie groups. 

The existence of a nontrivial center in SO(2m), SU(n), and Sp(n) 
shows that these groups are not simple, since the center is obviously a 
normal subgroup. Nevertheless, these groups are almost simple, because 
the center is in each case their largest normal subgroup. We have shown in 
Section 3.8 that the center is the largest normal subgroup that is discrete, 
in the sense that there is a minimum, nonzero, distance between any two 
of its elements. It therefore remains to show that there are no nondiscrete 
normal subgroups, which we do in Section 7.5. 

It turns out that the quotient groups of SO(2m), SU(n), and Sp(n) by 
their centers are simple and, from the Lie theory viewpoint, taking these 
quotients makes very little difference. The center is essentially “invisible,” 
because its tangent space is zero. We explain “invisibility” in Chapter 5, 
after looking at the tangent spaces of some particular groups in Chapter 4. 

It should be mentioned, however, that the quotient of a matrix group
by a normal subgroup is not necessarily a matrix group. Thus in taking
quotients we may leave the world of matrix groups. The first example was
discovered by Birkhoff [1936]. It is the quotient (called the Heisenberg
group) of the group of upper triangular matrices of the form

\[ \begin{pmatrix} 1 & x & y \\ 0 & 1 & z \\ 0 & 0 & 1 \end{pmatrix}, \quad \text{where } x, y, z \in \mathbb{R}, \]

by the subgroup of matrices of the form

\[ \begin{pmatrix} 1 & 0 & n \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \text{where } n \in \mathbb{Z}. \]

The Heisenberg group is a Lie group, but not isomorphic to a matrix group.

One of the reasons for looking at tangent spaces is that we do not have
to leave the world of matrices. A theorem of Ado from 1936 shows that
the tangent space of any Lie group G (the Lie algebra $\mathfrak{g}$) can be faithfully
represented by a space of matrices. And if G is almost simple then $\mathfrak{g}$ is truly
simple, in a sense that will be explained in Chapter 6. Thus the study of
simplicity is, well, simplified by passing from Lie groups to Lie algebras.

The importance of topology in Lie theory—and particularly paths and 
connectedness—was first realized by Schreier in 1925. Schreier published 
his results in the journal of the Hamburg mathematical seminar—a well- 
known journal for algebra and topology at the time—but they were not 
noticed by Lie theorists until after Schreier’s untimely death in 1929 at the 
age of 28. In 1929, Elie Cartan became aware of Schreier’s results and 
picked up the torch of topology in Lie theory. 

In the 1930s, Cartan proved several remarkable results on the topology
of Lie groups. One of them has the consequence that $S^1$ and $S^3$ are
the only spheres that admit a continuous group structure. Thus the Lie
groups SO(2) and SU(2), which we already know to be spheres, are the
only spheres that actually occur among Lie groups. Cartan’s proof uses
quite sophisticated topology, but his result is related to the theorem of
Frobenius mentioned in Section 1.6, that the only skew fields $\mathbb{R}^n$ are $\mathbb{R}$,
$\mathbb{R}^2 = \mathbb{C}$, and $\mathbb{R}^4 = \mathbb{H}$. In particular, there is a continuous and associative
“multiplication” (necessary for continuous group structure) only in
$\mathbb{R}$, $\mathbb{R}^2$, and $\mathbb{R}^4$. For more on the interplay between topology and algebra in
$\mathbb{R}^n$, see the book Ebbinghaus et al. [1990].


4 


The exponential map 


PREVIEW 


The group $S^1 = SO(2)$ studied in Chapter 1 can be viewed as the image of
the line $\mathbb{R}i = \{i\theta : \theta \in \mathbb{R}\}$ under the exponential function, because

\[ \exp(i\theta) = e^{i\theta} = \cos\theta + i\sin\theta. \]


This line is (in a sense we explain below) the tangent to the circle at its 
identity element 1. And, in fact, any Lie group has a linear space (of the 
same dimension as the group) as its tangent space at the identity. 

The group $S^3 = SU(2)$ is also the image, under a generalized exp function,
of a linear space. This linear space, the tangent space of SU(2) at
the identity, is three-dimensional and has an interesting algebraic structure.
Its points can be added (as vectors) and also multiplied in a way that
reflects the nontrivial conjugation operation $g_1, g_2 \mapsto g_1 g_2 g_1^{-1}$ in SU(2).
The algebra $\mathfrak{su}(2)$ on the tangent space is called the Lie algebra of the Lie
group SU(2), and it is none other than $\mathbb{R}^3$ with the vector product.

As we know from Chapter 1, complex numbers and quaternions can 
both be viewed as matrices. The exponential function exp generalizes to 
arbitrary square matrices, and we will see later that it maps the tangent 
space of any matrix Lie group G into G. In many cases exp is onto G, and in 
all cases the algebraic structure of G has a parallel structure on the tangent 
space, called the Lie algebra of G. In particular, the conjugation operation 
on G, which reflects the departure of G from commutativity, corresponds 
to an operation on the tangent space called the Lie bracket. 

We illustrate the exp function on matrices with the simplest nontrivial
example, the affine group of the line.


4.1 The exponential map onto SO(2)


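The relationship between the exponential function and the circle,

\[ e^{i\theta} = \cos\theta + i\sin\theta, \]

was discovered by Euler in his book Introduction to the Analysis of the
Infinite of 1748. One way to see why this relationship holds is to look at
the Taylor series for $e^x$, $\cos x$, and $\sin x$, and to suppose that the exponential
series is also meaningful for complex numbers.

\begin{align*}
e^x &= 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots, \\
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots, \\
\sin x &= \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots.
\end{align*}

The series for $e^x$ is absolutely convergent, so we may substitute $i\theta$ for x and
rearrange terms. This gives a definition of $e^{i\theta}$ and justifies the following
calculation:

\begin{align*}
e^{i\theta} &= 1 + \frac{i\theta}{1!} - \frac{\theta^2}{2!} - \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} + \frac{i\theta^5}{5!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right) \\
&= \cos\theta + i\sin\theta.
\end{align*}

Thus the exponential function maps the imaginary axis $\mathbb{R}i$ of points $i\theta$ onto
the circle $S^1$ of points $\cos\theta + i\sin\theta$ in the plane of complex numbers.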
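
The series calculation above can be spot-checked numerically. The following is a minimal Python sketch, not part of the original text: the function name `exp_series` and the choice of 30 terms are assumptions made for illustration.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z/1! + z^2/2! + ... using `terms` terms."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # turns z^n/n! into z^(n+1)/(n+1)!
    return total

theta = 2.0
approx = exp_series(1j * theta)
exact = complex(math.cos(theta), math.sin(theta))
print(abs(approx - exact))    # difference between series and cos + i sin
print(abs(approx))            # modulus, which should be close to 1
```

The modulus being close to 1 reflects the fact that the image of the imaginary axis lies on the unit circle.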

The operations of addition and negation on $\mathbb{R}i$ carry over to multiplication
and inversion on $S^1$, since

\[ e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1 + \theta_2)} \quad\text{and}\quad \left(e^{i\theta}\right)^{-1} = e^{-i\theta}. \]

There is not much more to say about $S^1$, because multiplication of
complex numbers is a well-known operation and the circle is a well-known
curve. However, we draw attention to one trifling fact, because it proves to
have a more interesting analogue in the case of $S^3$ that we study in the next
section. The line of points $i\theta$ mapped onto $S^1$ by the exponential function
can be viewed as the tangent to $S^1$ at the identity element 1 (Figure 4.1). Of
course, the points on the tangent are of the form $1 + i\theta$, but we ignore their
constant real part 1. The essential coordinate of a point on the tangent is its
imaginary part $i\theta$, giving its height $\theta$ above the x-axis. Note also that the
point $i\theta$ at height $\theta$ is mapped to the point $\cos\theta + i\sin\theta$ at arc length $\theta$.
Thus the exponential map preserves the length of sufficiently small arcs.


Figure 4.1: $S^1$ and its tangent at the identity.


Euler’s discovery that the exponential function can be extended to the
complex numbers, and that it can thereby map a straight line onto a curve,
was just the beginning. In the next section we will see that a further extension
of the exponential function can map the flat three-dimensional space
$\mathbb{R}^3$ onto a curved one, $S^3$, and in the next chapter we will see that such
exponential mappings exist in arbitrarily high dimensions.


Exercises 


The fundamental property of the exponential function is the addition formula,
which tells us that exp maps sums to products, that is,

\[ e^{A+B} = e^A e^B. \]

However, we are about to generalize the exponential function to objects that do
not enjoy all the algebraic properties of real or complex numbers, so it is important
to investigate whether the equation $e^{A+B} = e^A e^B$ still holds. The answer is that it
does, provided $AB = BA$.
We assume that

\[ e^X = 1 + \frac{X}{1!} + \frac{X^2}{2!} + \cdots, \quad \text{where 1 is the identity object.} \]

4.1.1 Assuming that $AB = BA$, show that

\[ (A+B)^m = A^m + \binom{m}{1}A^{m-1}B + \binom{m}{2}A^{m-2}B^2 + \cdots + \binom{m}{m-1}AB^{m-1} + B^m, \]

where $\binom{m}{l}$ denotes the number of ways of choosing l things from a set of
m things.


4.1.2 Show that $\binom{m}{l} = \frac{m(m-1)(m-2)\cdots(m-l+1)}{l!} = \frac{m!}{l!(m-l)!}$.


4.1.3 Deduce from Exercises 4.1.1 and 4.1.2 that the coefficient of $A^{m-l}B^l$ in

\[ 1 + \frac{A+B}{1!} + \frac{(A+B)^2}{2!} + \frac{(A+B)^3}{3!} + \cdots \]

is $1/l!(m-l)!$ when $AB = BA$.

4.1.4 Show that the coefficient of $A^{m-l}B^l$ in

\[ \left(1 + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots\right)\left(1 + \frac{B}{1!} + \frac{B^2}{2!} + \frac{B^3}{3!} + \cdots\right) \]

is also $1/l!(m-l)!$, and hence that $e^{A+B} = e^A e^B$ when $AB = BA$.


4.2 The exponential map onto SU(2) 


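If $u = bi + cj + dk$ is a unit vector in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, then $u^2 = -1$ by the
argument at the end of Section 1.4. This leads to the following elegant
extension of the exponential map from pure imaginary numbers to pure
imaginary quaternions.


Exponentiation theorem for $\mathbb{H}$. When we write an arbitrary element of
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ in the form $\theta u$, where u is a unit vector, we have

\[ e^{\theta u} = \cos\theta + u\sin\theta, \]

and the exponential function maps $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ onto $S^3 = SU(2)$.


Proof. For any pure imaginary quaternion v we define $e^v$ by the usual
infinite series

\[ e^v = 1 + \frac{v}{1!} + \frac{v^2}{2!} + \frac{v^3}{3!} + \cdots. \]

This series is absolutely convergent in $\mathbb{H}$ for the same reason as in $\mathbb{C}$: for
sufficiently large n, $|v|^n/n! < 2^{-n}$. Thus $e^v$ is meaningful for any pure
imaginary quaternion v. If $v = \theta u$, where u is pure imaginary and $|u| = 1$,
then $u^2 = -1$ by the remark above, and we get

\begin{align*}
e^{\theta u} &= 1 + \frac{\theta u}{1!} - \frac{\theta^2}{2!} - \frac{\theta^3 u}{3!} + \frac{\theta^4}{4!} + \frac{\theta^5 u}{5!} - \frac{\theta^6}{6!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots\right) + u\left(\frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots\right) \\
&= \cos\theta + u\sin\theta.
\end{align*}

Also, a point $a + bi + cj + dk \in S^3$ can be written in the form

\[ a + \frac{bi + cj + dk}{\sqrt{b^2 + c^2 + d^2}}\sqrt{b^2 + c^2 + d^2} = a + u\sqrt{b^2 + c^2 + d^2}, \]

where u is a unit pure imaginary quaternion. Since $a^2 + b^2 + c^2 + d^2 = 1$
for a quaternion $a + bi + cj + dk \in S^3$, there is a real $\theta$ such that

\[ a = \cos\theta, \quad \sqrt{b^2 + c^2 + d^2} = \sin\theta. \]

Thus any point in $S^3$ is of the form $\cos\theta + u\sin\theta$, and so the exponential
map is from $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ onto $S^3$.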
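
The exponentiation theorem can be tested on a sample quaternion. The Python sketch below is illustrative only and makes its own assumptions: quaternions are stored as 4-tuples (a, b, c, d) standing for a + bi + cj + dk, and the names `qmul`, `qexp_series`, and `qexp_closed` are hypothetical helpers, not notation from the text.

```python
import math

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qexp_series(v, terms=40):
    """Partial sum of 1 + v/1! + v^2/2! + ... for a quaternion v."""
    total = (0.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(terms):
        total = tuple(s + t for s, t in zip(total, term))
        term = tuple(x / (n + 1) for x in qmul(term, v))
    return total

def qexp_closed(theta, u):
    """cos(theta) + u sin(theta) for a unit pure imaginary u = (0, b, c, d)."""
    return (math.cos(theta),) + tuple(math.sin(theta) * x for x in u[1:])

u = (0.0, 1/3, 2/3, 2/3)          # unit vector: (1 + 4 + 4)/9 = 1
theta = 1.2
v = tuple(theta * x for x in u)   # the element theta*u of Ri + Rj + Rk

series = qexp_series(v)
closed = qexp_closed(theta, u)
print(max(abs(x - y) for x, y in zip(series, closed)))  # series vs closed form
print(sum(x * x for x in series))                       # squared norm, near 1
```

The squared norm being close to 1 is the numerical counterpart of the statement that the image lies in the 3-sphere.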


Up to this point, we have a beautiful analogy with the exponential map
in $\mathbb{C}$. The three-dimensional space $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ is the tangent space of the
3-sphere $S^3 = SU(2)$ at the identity element 1, as we will see in the next
section.

But the algebraic situation on $S^3$ is more interesting (if you like, more
complex) than on $S^1$. For a pair of elements $u, v \in S^3$ we generally have
$uv \neq vu$, and hence $uvu^{-1} \neq v$. Thus the element $uvu^{-1}$, the conjugate of v
by $u^{-1}$, detects failure to commute. Remarkably, the conjugation operation
on $S^3 = SU(2)$ is reflected in a noncommutative operation on the tangent
space $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ that we uncover in the next section.


Exercises 


4.2.1 Show that the exponential function maps any line through O in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$
onto a circle of radius 1 in $S^3$.


Since we can have $uv \neq vu$ for quaternions u and v, it can be expected, from the
previous exercise set, that we can have $e^u e^v \neq e^{u+v}$.

4.2.2 Explain why $i = e^{i\pi/2}$ and $j = e^{j\pi/2}$.

4.2.3 Deduce from Exercise 4.2.2 that at least one of $e^{i\pi/2}e^{j\pi/2}$, $e^{j\pi/2}e^{i\pi/2}$ is not
equal to $e^{i\pi/2 + j\pi/2}$.


4.3 The tangent space of SU(2)


The space Ri+ Rj+ Rk mapped onto SU(2) by the exponential function 
is the tangent space at 1 of SU(2), just as the line Ri is the tangent line 
at 1 of the circle SO(2). But SU(2), unlike SO(2), cannot be viewed from 
“outside” by humans, so we need a method for finding tangent vectors from 
“inside” SU(2). This method will later be used for the higher-dimensional 
groups SO(n), SU(n), and so on. 

The idea is to view a tangent vector at 1 as the “velocity vector” of
a smoothly moving point as it passes through 1. To be precise, consider
a differentiable function of t, whose values q(t) are unit quaternions, and
suppose that $q(0) = 1$. Then the “velocity” $q'(0)$ at $t = 0$ is a tangent vector
to SU(2), and all the tangent vectors to SU(2) at 1 are obtained in this way.

The assumption that q(t) is a unit quaternion for each t in the domain
of q means that

\[ q(t)\overline{q(t)} = 1, \tag{*} \]

because $q\overline{q} = |q|^2$ for each quaternion q, as we saw in Section 1.3. By
differentiating (*), using the product rule, we find that

\[ q'(t)\overline{q(t)} + q(t)\overline{q'(t)} = 0. \]

(The usual proof of the product rule applies, even though quaternions do
not necessarily commute; it is a good exercise to check why this is so.)
Then setting $t = 0$, and bearing in mind that $q(0) = 1$, we obtain

\[ q'(0) + \overline{q'(0)} = 0. \]

So, every tangent vector $q'(0)$ to SU(2) satisfies

\[ q'(0) + \overline{q'(0)} = 0, \]

which means that $q'(0)$ is a pure imaginary quaternion p. Conversely, if
p is any pure imaginary quaternion, then $pt \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ for any real
number t, and we know from the previous section that $e^{pt} \in SU(2)$. Thus
$q(t) = e^{pt}$ is a path in SU(2). This path passes through 1 when $t = 0$, and
it is smooth because it has the derivative

\[ q'(t) = pe^{pt}. \]

(To see why, differentiate the infinite series for $e^{pt}$.) Finally, $q'(0) = p$,
because $e^0 = 1$. Thus every pure imaginary quaternion is a tangent vector
to SU(2) at 1, and so the tangent space of SU(2) at 1 is $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$.

This construction of the tangent space to SU(2) at 1 provides a model
that we will follow for the so-called classical Lie groups in Chapter 5. In all
cases it is easy to find the general form of a tangent vector by differentiating
the defining equation of the group, but one needs the exponential function
(for matrices) to confirm that each matrix X of the form in question is in
fact a tangent vector (namely, the tangent to the smooth path $e^{tX}$).
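
As a concrete instance of both directions of this argument, one can take the pure imaginary quaternion $p = k$ (an illustrative choice, not one made in the text) and follow the path $q(t) = e^{kt}$ through the steps above:

```latex
\begin{align*}
q(t)  &= e^{kt} = \cos t + k\sin t, \\
q'(t) &= -\sin t + k\cos t = k(\cos t + k\sin t) = ke^{kt}, \\
q'(0) &= k, \qquad q'(0) + \overline{q'(0)} = k - k = 0.
\end{align*}
```

Here the second line uses only $k^2 = -1$, and the third line confirms that the velocity at 1 is pure imaginary, as the general argument requires.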


The Lie bracket 


The great idea of Sophus Lie was to look at elements “infinitesimally close 
to the identity” in a Lie group, and to use them to infer behavior of ordi- 
nary elements. The modern version of Lie’s idea is to infer properties of 
the Lie group from properties of its tangent space. A commutative group 
operation, as on SO(2), is completely captured by the sum operation on the 
tangent space, because $e^{x+y} = e^x e^y$. The real secret of the tangent space is
an extra structure called the Lie bracket operation, which reflects the non- 
commutative content of the group operation. (For a commutative group, 
such as SO(2), the Lie bracket on the tangent space is always zero.) 

In the case of SU(2) we can already see that the sum operation on 
Ri+ Rj+ Rk is commutative, so it cannot adequately reflect the product 
operation on SU(2). Nor can the product on SU(2) be captured by the 
quaternion product on Ri+ Rj+ Rk, because the quaternion product is not 
always defined on Ri+ Rj+ Rk. For example, i belongs to Ri+ Rj + Rk 
but the product $i^2$ does not. What we find is that the noncommutative
content of the product on SU(2) is captured by the Lie bracket of pure 
imaginary quaternions U, V defined by 


[U,V] =UV —VU. 


This comes about as follows. Suppose that u(s) and v(t) are two smooth
paths through 1 in SU(2), with $u(0) = v(0) = 1$. For each fixed s we consider
the path

\[ w_s(t) = u(s)v(t)u(s)^{-1}. \]

This path also passes through 1, and its tangent there is

\[ w_s'(0) = u(s)v'(0)u(s)^{-1} = u(s)Vu(s)^{-1}, \]

where $V = v'(0)$ is the tangent vector to v(t) at 1. Now $w_s'(0)$ is a tangent
vector at 1 for each s, so (letting s vary)

\[ x(s) = u(s)Vu(s)^{-1} \]

is a smooth path in $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$. The tangent $x'(0)$ to this path at $s = 0$ is
also an element of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, because $x'(0)$ is the limit of differences
between elements of $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, and $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ is closed under differences
and limits. By the product rule for differentiation, and because
$u(0) = 1$, the tangent vector $x'(0)$ is

\begin{align*}
\left.\frac{d}{ds}\right|_{s=0} u(s)Vu(s)^{-1} &= u'(0)Vu(0)^{-1} + u(0)V(-u'(0)) \\
&= UV - VU,
\end{align*}

where $U = u'(0)$ is the tangent vector to u(s) at 1. (The second term uses the
fact that the derivative of $u(s)^{-1}$ at $s = 0$ is $-u'(0)$; see Exercise 4.3.2.)

It follows that if $U, V \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ then $[U,V] \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$. It is
possible to give a direct algebraic proof of this fact (see exercises). But the
proof above shows the connection between the conjugate of v(t) by $u(s)^{-1}$
and the Lie bracket of their tangent vectors, and it generalizes to a proof
that $U, V \in T_1(G)$ implies $[U,V] \in T_1(G)$ for any matrix Lie group G. In
fact, we revisit this proof in Section 5.4.
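
The relation between conjugation and the bracket can be observed numerically by differentiating $x(s) = u(s)Vu(s)^{-1}$ with a finite difference. The sketch below is an illustration under stated assumptions, not the author's method: it takes the particular path $u(s) = e^{sU}$, stores quaternions as 4-tuples (a, b, c, d), uses a central difference with step h = 1e-5, and all function names are hypothetical.

```python
import math

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    """Conjugate, which is also the inverse of a unit quaternion."""
    return (q[0], -q[1], -q[2], -q[3])

def qexp_pure(p):
    """exp of a pure imaginary p, via cos(theta) + u sin(theta), theta = |p|."""
    theta = math.sqrt(p[1]**2 + p[2]**2 + p[3]**2)
    if theta == 0.0:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(theta) / theta
    return (math.cos(theta), s * p[1], s * p[2], s * p[3])

def bracket(U, V):
    """Lie bracket [U, V] = UV - VU."""
    return tuple(a - b for a, b in zip(qmul(U, V), qmul(V, U)))

def conjugation_path(U, V, s):
    """x(s) = u(s) V u(s)^{-1} for the assumed path u(s) = exp(sU)."""
    u = qexp_pure(tuple(s * x for x in U))
    return qmul(qmul(u, V), qconj(u))

def derivative_at_zero(U, V, h=1e-5):
    """Central-difference estimate of x'(0)."""
    plus = conjugation_path(U, V, h)
    minus = conjugation_path(U, V, -h)
    return tuple((a - b) / (2 * h) for a, b in zip(plus, minus))

U = (0.0, 1.0, 2.0, 0.0)    # i + 2j
V = (0.0, 0.0, 1.0, -1.0)   # j - k
print(derivative_at_zero(U, V))  # numerical x'(0)
print(bracket(U, V))             # UV - VU
```

The two printed tuples should agree to within the finite-difference error, and both have real part (approximately) zero, in line with the closure of the tangent space under the bracket.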


Exercises 


The definition of derivative for any function c(t) of a real variable t is

\[ c'(t) = \lim_{\Delta t \to 0} \frac{c(t + \Delta t) - c(t)}{\Delta t}. \]

4.3.1 By imitating the usual proof of the product rule, show that if $c(t) = a(t)b(t)$
then

\[ c'(t) = a'(t)b(t) + a(t)b'(t). \]

(Do not assume that the product operation is commutative.)

4.3.2 Show also that if $c(t) = a(t)^{-1}$, and $a(0) = 1$, then $c'(0) = -a'(0)$, again
without assuming that the product is commutative.


4.3.3 Show, however, that if $c(t) = a(t)^2$ then $c'(t)$ is not equal to $2a(t)a'(t)$ for a
certain quaternion-valued function a(t).


To investigate the Lie bracket operation on $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, it helps to know what
it has in common with more familiar product operations, namely bilinearity: for
any real numbers $a_1$ and $a_2$,

\[ [a_1U_1 + a_2U_2, V] = a_1[U_1,V] + a_2[U_2,V], \quad [U, a_1V_1 + a_2V_2] = a_1[U,V_1] + a_2[U,V_2]. \]

4.3.4 Deduce the bilinearity property from the definition of [U,V].


4.3.5 Using bilinearity, or otherwise, show that $U, V \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ implies
$[U,V] \in \mathbb{R}i + \mathbb{R}j + \mathbb{R}k$.


4.4 The Lie algebra $\mathfrak{su}(2)$ of SU(2)


The tangent space $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ of SU(2) is a real vector space, or a vector
space over $\mathbb{R}$. That is, it is closed under the vector sum operation, and also
under multiplication by real numbers. The additional structure provided by
the Lie bracket operation makes it what we call $\mathfrak{su}(2)$, the Lie algebra of
SU(2). (It is traditional to denote the Lie algebra of a Lie group by the
corresponding lower case Fraktur (also called German or Gothic) letter. Thus
the Lie algebra of G will be denoted by $\mathfrak{g}$, the Lie algebra of SU(n) by
$\mathfrak{su}(n)$, and so on.) In general, a Lie algebra is a vector space with a bilinear
operation [ , ] satisfying

\begin{align*}
[X,Y] + [Y,X] &= 0, \\
[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]] &= 0.
\end{align*}

These algebraic properties look like poor relations of the commutative and
associative laws, and no doubt they seem rather alien at first. Nevertheless,
they are easily seen to be satisfied by the Lie bracket $[U,V] = UV - VU$ on
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ and, more generally, on any vector space of matrices closed
under the operation $U, V \mapsto UV - VU$ (see exercises). In the next chapter
we will see that the tangent space of any so-called classical group is a Lie
algebra for much the same reason that $\mathfrak{su}(2)$ is.

What makes $\mathfrak{su}(2)$ particularly interesting is that it is probably the only
nontrivial Lie algebra that anyone meets before studying Lie theory. Its
Lie bracket is not as alien as it looks, being essentially the cross product
operation on $\mathbb{R}^3$ that one meets in vector algebra.

To see why, consider the Lie brackets of the basis vectors i, j, and k of
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, which are

\begin{align*}
[i,j] &= ij - ji = k + k = 2k, \\
[j,k] &= jk - kj = i + i = 2i, \\
[k,i] &= ki - ik = j + j = 2j.
\end{align*}

Then, if we introduce the new basis vectors

\[ i' = i/2, \quad j' = j/2, \quad k' = k/2, \]

we get

\[ [i',j'] = k', \quad [j',k'] = i', \quad [k',i'] = j'. \]

The latter equations are precisely the same as those defining the cross product
on the usual basis vectors.

This probably makes it clear that the cross product on $\mathbb{R}^3$ is “the same”
as the Lie bracket on $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$, but we can spell out precisely why
by setting up a 1-to-1 correspondence between $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ and $\mathbb{R}^3$ that
preserves the vector sum and scalar multiples (the vector space operations),
while sending the Lie bracket to the cross product.

The map $\varphi : bi + cj + dk \mapsto (2b, 2c, 2d)$ is a 1-to-1 correspondence that
preserves the vector space operations, and it also sends i', j', k' and their
Lie brackets to i, j, k and their cross products, respectively. It follows that
$\varphi$ sends all Lie brackets to the corresponding cross products, because the
Lie bracket of arbitrary vectors, like the cross product of arbitrary vectors,
is determined by its values on the basis vectors (by bilinearity).
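
The correspondence between the Lie bracket and the cross product can be checked on sample vectors. The following Python sketch is illustrative: quaternions are stored as 4-tuples (a, b, c, d), and the names `qmul`, `bracket`, `cross`, and `phi` are hypothetical helpers, with `phi` implementing the map $bi + cj + dk \mapsto (2b, 2c, 2d)$ described above.

```python
def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def bracket(U, V):
    """Lie bracket [U, V] = UV - VU of quaternions."""
    return tuple(a - b for a, b in zip(qmul(U, V), qmul(V, U)))

def cross(x, y):
    """Cross product of vectors in R^3."""
    return (x[1]*y[2] - x[2]*y[1],
            x[2]*y[0] - x[0]*y[2],
            x[0]*y[1] - x[1]*y[0])

def phi(q):
    """The map bi + cj + dk -> (2b, 2c, 2d)."""
    return (2 * q[1], 2 * q[2], 2 * q[3])

U = (0.0, 0.5, -1.0, 2.0)
V = (0.0, 3.0, 0.25, -1.5)

print(phi(bracket(U, V)))     # image of the Lie bracket
print(cross(phi(U), phi(V)))  # cross product of the images
```

Both lines print the same vector for this choice of U and V, which is what the bilinearity argument predicts for every pair.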


Exercises 


The second property of the Lie bracket is known as the Jacobi identity, and all
beginners in Lie theory are asked to check that it follows from the definition
$[X,Y] = XY - YX$.

4.4.1 Prove the Jacobi identity by using the definition $[X,Y] = XY - YX$ to expand
$[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]]$. Assume only that the product is
associative and that the usual laws for plus and minus apply.

4.4.2 Using known properties of the cross product, or otherwise, show that the
Lie bracket operation on $\mathfrak{su}(2)$ is not associative.


In the words of Kaplansky [1963], p. 123,


... the commutative and associative laws, so sadly lacking in the Lie
algebra itself, are acquired under the mantle of f.


By f he means a certain inner product, called the Killing form. A special case of
it is the ordinary inner product on $\mathbb{R}^3$, for which we certainly have commutativity:
$u \cdot v = v \cdot u$. “Associativity under the mantle of the inner product” means
$(u \times v) \cdot w = u \cdot (v \times w)$.

4.4.3 Show that if

\[ u = u_1 i + u_2 j + u_3 k, \quad v = v_1 i + v_2 j + v_3 k, \quad w = w_1 i + w_2 j + w_3 k, \]

then

\[ u \cdot (v \times w) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}. \]

4.4.4 Deduce from Exercise 4.4.3 that $(u \times v) \cdot w = u \cdot (v \times w)$.


4.5 The exponential of a square matrix


We define the matrix absolute value of $A = (a_{ij})$ to be

\[ |A| = \sqrt{\sum_{i,j} |a_{ij}|^2}. \]

For an n × n real matrix A the absolute value |A| is the distance from the
origin O in $\mathbb{R}^{n^2}$ of the point

\[ (a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, a_{22}, \ldots, a_{2n}, \ldots, a_{n1}, \ldots, a_{nn}). \]

If A has complex entries, and if we interpret each copy of $\mathbb{C}$ as $\mathbb{R}^2$ (as in
Section 3.3), then |A| is the distance from O of the corresponding point in
$\mathbb{R}^{2n^2}$. Similarly, if A has quaternion entries, then |A| is the distance from O
of the corresponding point in $\mathbb{R}^{4n^2}$.

In all cases, $|A - B|$ is the distance between the matrices A and B, and
we say that a sequence $A_1, A_2, A_3, \ldots$ of n × n matrices has limit A if, for
each $\varepsilon > 0$, there is an integer M such that

\[ m > M \implies |A_m - A| < \varepsilon. \]

The key property of the matrix absolute value is the following inequality,
a consequence of the triangle inequality (which holds in the plane and
hence in any $\mathbb{R}^n$) and the Cauchy-Schwarz inequality.


Submultiplicative property. For any two real n × n matrices A and B,

\[ |AB| \le |A||B|. \]

Proof. If $A = (a_{ij})$ and $B = (b_{ij})$, then it follows from the definition of
matrix product that

\begin{align*}
|(i,j)\text{-entry of } AB| &= |a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}| \\
&\le |a_{i1}b_{1j}| + |a_{i2}b_{2j}| + \cdots + |a_{in}b_{nj}| && \text{by the triangle inequality} \\
&= |a_{i1}||b_{1j}| + |a_{i2}||b_{2j}| + \cdots + |a_{in}||b_{nj}| && \text{by the multiplicative property of absolute value} \\
&\le \sqrt{|a_{i1}|^2 + \cdots + |a_{in}|^2}\,\sqrt{|b_{1j}|^2 + \cdots + |b_{nj}|^2} && \text{by the Cauchy-Schwarz inequality.}
\end{align*}

Now, summing the squares of both sides, we get

\begin{align*}
|AB|^2 &= \sum_{i,j} |(i,j)\text{-entry of } AB|^2 \\
&\le \sum_{i,j} \left(|a_{i1}|^2 + \cdots + |a_{in}|^2\right)\left(|b_{1j}|^2 + \cdots + |b_{nj}|^2\right) \\
&= \sum_i \left(|a_{i1}|^2 + \cdots + |a_{in}|^2\right) \sum_j \left(|b_{1j}|^2 + \cdots + |b_{nj}|^2\right) \\
&= |A|^2|B|^2, \quad \text{as required.}
\end{align*}
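
A small numerical instance may make the inequality concrete. In this Python sketch the helper names `mat_abs` and `mat_mul` and the sample matrices A and B are arbitrary choices made for illustration, not taken from the text.

```python
import math

def mat_abs(a):
    """Matrix absolute value: square root of the sum of squared entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def mat_mul(a, b):
    """Product of matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.0, -1.0], [5.0, 2.0]]

lhs = mat_abs(mat_mul(A, B))     # |AB|
rhs = mat_abs(A) * mat_abs(B)    # |A||B|
print(lhs, rhs, lhs <= rhs)
```

For these matrices $|AB|^2 = 534$ while $|A|^2|B|^2 = 30 \cdot 30 = 900$, so the inequality is strict here.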


It follows from the submultiplicative property that $|A^m| \le |A|^m$. Along
with the triangle inequality $|A + B| \le |A| + |B|$, the submultiplicative property
enables us to test convergence of matrix infinite series by comparing
them with series of real numbers. In particular, we have:


Convergence of the exponential series. If A is any n × n real matrix, then

\[ 1 + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots, \quad \text{where } 1 = n \times n \text{ identity matrix}, \]

is convergent in $\mathbb{R}^{n^2}$.


Proof. It suffices to prove that this series is absolutely convergent, that is,
to prove the convergence of

\[ |1| + \frac{|A|}{1!} + \frac{|A^2|}{2!} + \frac{|A^3|}{3!} + \cdots. \]

This is a series of positive real numbers, whose terms (except for the first)
are less than or equal to the corresponding terms of

\[ 1 + \frac{|A|}{1!} + \frac{|A|^2}{2!} + \frac{|A|^3}{3!} + \cdots \]

by the submultiplicative property. The latter series is the series for the real
exponential function $e^{|A|}$; hence the original series is convergent.


Thus it is meaningful to make the following definition, valid for real,
complex, or quaternion matrices.

Definition. The exponential of any n × n matrix A is given by the series

\[ e^A = 1 + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots. \]

The matrix exponential function is a generalization of the complex and
quaternion exponential functions. We already know that each complex
number $z = a + bi$ can be represented by the 2 × 2 real matrix

\[ Z = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]

and it is easy to check that $e^z$ is represented by $e^Z$. We defined the quaternion
$q = a + bi + cj + dk$ to be the 2 × 2 complex matrix

\[ Q = \begin{pmatrix} a + di & -b + ci \\ b + ci & a - di \end{pmatrix}, \]

so the exponential of a quaternion matrix may be represented by the exponential
of a complex matrix.


From now on we will often denote the exponential function simply by 
exp, regardless of the type of objects being exponentiated. 
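
The definition translates directly into a truncated series. The Python sketch below is a naive illustration under stated assumptions: the names `mat_mul` and `mat_exp` and the cutoff of 30 terms are choices made here, and the test matrix is the 2 × 2 real matrix representing the complex number $i\theta$ (so a = 0 and b = θ in the representation above). It is not meant as a robust numerical method.

```python
import math

def mat_mul(a, b):
    """Product of square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=30):
    """Partial sum 1 + A/1! + A^2/2! + ... using `terms` terms."""
    n = len(a)
    total = [[0.0] * n for _ in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for m in range(terms):
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        # Turn A^m/m! into A^(m+1)/(m+1)!.
        term = [[x / (m + 1) for x in row] for row in mat_mul(term, a)]
    return total

theta = 0.75
Z = [[0.0, -theta], [theta, 0.0]]   # represents the complex number i*theta
E = mat_exp(Z)
print(E)
print([[math.cos(theta), -math.sin(theta)],
       [math.sin(theta), math.cos(theta)]])
```

The two printed matrices should agree to within rounding, consistent with $e^z$ being represented by $e^Z$ when $z = i\theta$.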


Exercises 


The version of the Cauchy-Schwarz inequality used to prove the submultiplicative
property is the real inner product inequality $|u \cdot v| \le |u||v|$, where

\[ u = (|a_{i1}|, |a_{i2}|, \ldots, |a_{in}|) \quad\text{and}\quad v = (|b_{1j}|, |b_{2j}|, \ldots, |b_{nj}|). \]

It is probably a good idea for me to review this form of Cauchy-Schwarz, since
some readers may not have seen it.
The proof depends on the fact that $w \cdot w = |w|^2 \ge 0$ for any real vector w.


4.5.1 Show that $0 \le (u + xv) \cdot (u + xv) = |u|^2 + 2(u \cdot v)x + x^2|v|^2 = q(x)$, for any
real vectors u, v and real number x.


4.5.2 Use the positivity of the quadratic function q(x) found in Exercise 4.5.1 to
deduce that

\[ (2u \cdot v)^2 - 4|u|^2|v|^2 \le 0, \]

that is, $|u \cdot v| \le |u||v|$.

Matrix exponentiation gives another proof that $e^{i\theta} = \cos\theta + i\sin\theta$, since we
can interpret $i\theta$ as a 2 × 2 real matrix A.


4.5.3 Show, directly from the definition of matrix exponentiation, that

\[ A = \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix} \implies e^A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

The exponential of an arbitrary matrix is hard to compute in general, but easy
when the matrix is diagonal, or diagonalizable.


4.5.4 Suppose that D is a diagonal matrix with diagonal entries $\lambda_1, \lambda_2, \ldots, \lambda_k$. By
computing the powers $D^n$ show that $e^D$ is a diagonal matrix with diagonal
entries $e^{\lambda_1}, e^{\lambda_2}, \ldots, e^{\lambda_k}$.


4.5.5 If A is a matrix of the form $BCB^{-1}$, show that $e^A = Be^CB^{-1}$.


4.5.6 By term-by-term differentiation, or otherwise, show that $\frac{d}{dt}e^{tA} = Ae^{tA}$ for
any square matrix A.


4.6 The affine group of the line


Transformations of $\mathbb{R}$ of the form

\[ f_{a,b}(x) = ax + b, \quad \text{where } a, b \in \mathbb{R} \text{ and } a > 0, \]

are called affine transformations. They form a group because the product of
any two such transformations is another of the same form, and the inverse
of any such transformation is another of the same form. We call this group
Aff(1), and we can view it as a matrix group. The function $f_{a,b}$ corresponds
to the matrix

\[ F_{a,b} = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}, \quad \text{applied on the left to } \begin{pmatrix} x \\ 1 \end{pmatrix}, \]

because

\[ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} ax + b \\ 1 \end{pmatrix}. \]

Thus Aff(1) can be viewed as a group of 2 × 2 real matrices, and hence
it is a geometric object in $\mathbb{R}^4$. On the other hand, Aff(1) is intrinsically
two-dimensional, because its elements form a half-plane. To see why,
consider first the two-dimensional subspace of $\mathbb{R}^4$ consisting of the points
(a, b, 0, 0). This is a plane, and hence so is the set of points (a, b, 0, 1)
obtained by translating it by distance 1 in the direction of the fourth coordinate.
Finally, we get half of this plane by restricting the first coordinate
to $a > 0$.

Aff(1) is closed under nonsingular limits; hence it is a two-dimensional
matrix Lie group, like the real vector space $\mathbb{R}^2$ under vector addition, and
the torus $S^1 \times S^1$. Unlike these two matrix Lie groups, however, Aff(1) is
not abelian. For example,

\[ f_{1,2}f_{2,1}(x) = 1(2x + 1) + 2 = 2x + 3, \]

whereas

\[ f_{2,1}f_{1,2}(x) = 2(1x + 2) + 1 = 2x + 5. \]
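
The same failure of commutativity shows up in the matrix picture. The following Python sketch is illustrative only; the helper names `F`, `mat_mul`, and `apply_affine` are hypothetical, with `F(a, b)` standing for the matrix that corresponds to the function x ↦ ax + b.

```python
def F(a, b):
    """The matrix [[a, b], [0, 1]] corresponding to x -> a*x + b."""
    return [[a, b], [0, 1]]

def mat_mul(m1, m2):
    """Product of 2 x 2 matrices given as lists of rows."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply_affine(m, x):
    """First coordinate of m applied on the left to the column (x, 1)."""
    return m[0][0] * x + m[0][1]

left = mat_mul(F(1, 2), F(2, 1))    # apply x -> 2x + 1 first, then x -> x + 2
right = mat_mul(F(2, 1), F(1, 2))   # apply x -> x + 2 first, then x -> 2x + 1
print(left, right, left == right)
print(apply_affine(left, 10), apply_affine(right, 10))
```

The products are the matrices of x ↦ 2x + 3 and x ↦ 2x + 5 respectively, so the order of multiplication matters.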


Aff(1) is in fact the only connected, nonabelian two-dimensional Lie group. 
This makes it interesting, yet still amenable to computation. As we will 
see, it is easy to compute its tangent vectors, and to exponentiate them, 
from first principles. But first note that there are two ways in which Aff(1) 
differs from the Lie groups studied in previous chapters. 


e Asa geometric object, Aff(1) is an unbounded subset of R* (because 
b can be arbitrary and a is an arbitrary positive number). We say that 
it is a noncompact Lie group, whereas SO(2), SO(3), and SU(2) 
are compact. In Chapter 8 we give a more precise discussion of 
compactness. 


• As a group, it admits an ∞-to-1 homomorphism onto another infinite
group. The homomorphism $\varphi$ in question is

$$\varphi : \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \mapsto \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}.$$

This sends the infinitely many matrices $F_{a,b}$, as b varies, to the matrix
$F_{a,0}$, and it is easily checked that

$$\varphi(F_{a_1,b_1} F_{a_2,b_2}) = \varphi(F_{a_1,b_1})\,\varphi(F_{a_2,b_2}).$$

It follows, in particular, that Aff(1) is not a simple group. Also, the
normal subgroup of matrices in the kernel of $\varphi$ is itself a matrix Lie group.
The kernel consists of all the matrices that $\varphi$ sends to the identity matrix,
namely, the group of matrices of the form

$$\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \quad\text{for } b \in \mathbb{R}.$$

Geometrically, this subgroup is a line, and the group operation corresponds
to addition on the line, because

$$\begin{pmatrix} 1 & b_1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & b_2 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & b_1 + b_2 \\ 0 & 1 \end{pmatrix}.$$
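
The matrix picture of Aff(1) can be checked with a few lines of plain Python. The following is only a sketch: the helper names `F`, `matmul`, and `phi` are illustrative choices and not notation from the text, and the sample values of a and b are arbitrary.

```python
# Sketch: represent f_{a,b}(x) = a*x + b by the matrix F(a, b) = [[a, b], [0, 1]].
# The names F, matmul, and phi are hypothetical helpers chosen for this example.

def F(a, b):
    """Matrix of the affine map x -> a*x + b (with a > 0)."""
    return [[a, b], [0, 1]]

def matmul(M, N):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def phi(M):
    """The homomorphism that forgets b: [[a, b], [0, 1]] -> [[a, 0], [0, 1]]."""
    return [[M[0][0], 0], [0, 1]]

# Composition of functions corresponds to matrix multiplication.
left = matmul(F(1, 2), F(2, 1))    # f_{1,2} f_{2,1}(x) = 2x + 3
right = matmul(F(2, 1), F(1, 2))   # f_{2,1} f_{1,2}(x) = 2x + 5
print(left, right)                 # [[2, 3], [0, 1]] [[2, 5], [0, 1]]

# phi respects products, and multiplying kernel elements adds their b entries.
assert phi(left) == matmul(phi(F(1, 2)), phi(F(2, 1)))
assert matmul(F(1, 3), F(1, 4)) == F(1, 7)
```

The two products differ in their upper right entry, which is the nonabelian behavior noted above, while `phi` sends both to the same matrix.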


The Lie algebra of Aff(1) 


Since Aff(1) is half of a plane in the space $\mathbb{R}^4$ of 2 × 2 matrices, it is
geometrically clear that its tangent space at the identity element is a plane.
However, to find explicit matrices for the elements of the tangent space
we look at the vectors from the identity element $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ of Aff(1) to nearby
points of Aff(1).

These are the vectors

$$\begin{pmatrix} 1+\alpha & \beta \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix} = \alpha \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \beta \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$$

for small values of α and β. Normally, one needs to find the limiting
directions of these vectors (the “tangent vectors”) as α, β → 0, but in this
case all such directions lie in the plane spanned by the vectors

$$J = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

The Lie bracket [u,v] = uv − vu on this two-dimensional space is
determined by the Lie bracket of the basis vectors:

$$[J,K] = K.$$


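The exponential function maps the tangent space 1-to-1 onto Aff(1), as
one sees from some easy calculations with a general matrix $\begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix}$ in the
tangent space. First, induction shows that

$$\begin{pmatrix} \alpha & \beta \\ 0 & 0 \end{pmatrix}^n = \begin{pmatrix} \alpha^n & \beta\alpha^{n-1} \\ 0 & 0 \end{pmatrix},$$

or, in terms of J and K,

$$(\alpha J + \beta K)^n = \alpha^n J + \beta\alpha^{n-1} K.$$

Then substituting these powers in the exponential series (and writing 1 for
the identity matrix) gives

$$\begin{aligned}
e^{\alpha J + \beta K}
&= 1 + \frac{1}{1!}(\alpha J + \beta K) + \frac{1}{2!}(\alpha^2 J + \beta\alpha K) + \cdots + \frac{1}{n!}(\alpha^n J + \beta\alpha^{n-1} K) + \cdots \\
&= 1 + \left(\frac{\alpha}{1!} + \frac{\alpha^2}{2!} + \cdots + \frac{\alpha^n}{n!} + \cdots\right) J + \beta\left(\frac{1}{1!} + \frac{\alpha}{2!} + \cdots + \frac{\alpha^{n-1}}{n!} + \cdots\right) K \\
&= \begin{pmatrix} e^{\alpha} & \frac{\beta}{\alpha}(e^{\alpha} - 1) \\ 0 & 1 \end{pmatrix}
\quad\text{or}\quad \begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix} \text{ if } \alpha = 0.
\end{aligned}$$

The former matrix equals $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$, where a > 0, for a unique choice of α
and β. First choose α so that $a = e^{\alpha}$; then choose β so that

$$b = \frac{\beta}{\alpha}(e^{\alpha} - 1) \quad\text{or}\quad b = \beta \text{ if } \alpha = 0.$$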
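
The calculation above can be illustrated numerically. The sketch below sums the exponential series for αJ + βK directly, compares it with the closed form, and then inverts the map as described. The function names `exp_series`, `exp_closed`, and `log_aff`, the truncation at 40 terms, and the sample values of α and β are all assumptions made for the example.

```python
import math

def exp_series(alpha, beta, terms=40):
    """Partial sum of the exponential series for X = [[alpha, beta], [0, 0]]."""
    X = [[alpha, beta], [0.0, 0.0]]
    total = [[1.0, 0.0], [0.0, 1.0]]   # running sum, starting from the identity
    power = [[1.0, 0.0], [0.0, 1.0]]   # X^0
    for n in range(1, terms):
        power = [[sum(power[i][k] * X[k][j] for k in range(2)) for j in range(2)]
                 for i in range(2)]    # now X^n
        for i in range(2):
            for j in range(2):
                total[i][j] += power[i][j] / math.factorial(n)
    return total

def exp_closed(alpha, beta):
    """Closed form of e^(alpha*J + beta*K) derived in the text."""
    if alpha == 0:
        return [[1.0, beta], [0.0, 1.0]]
    return [[math.exp(alpha), beta / alpha * (math.exp(alpha) - 1)], [0.0, 1.0]]

def log_aff(a, b):
    """Recover (alpha, beta) from the matrix [[a, b], [0, 1]] with a > 0."""
    alpha = math.log(a)
    beta = b if alpha == 0 else b * alpha / (a - 1)
    return alpha, beta

alpha, beta = 0.7, -1.3
series = exp_series(alpha, beta)
closed = exp_closed(alpha, beta)
max_gap = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(max_gap)                                   # tiny: the two computations agree

recovered = log_aff(closed[0][0], closed[0][1])
print(recovered)                                 # close to (0.7, -1.3)
```

Recovering (α, β) from the matrix is the numerical counterpart of the statement that the exponential map is 1-to-1 onto Aff(1).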


Exercises 


Exponentiation of matrices does not have all the properties of ordinary exponen- 
tiation, because matrices do not generally commute. However, exponentiation 
works normally on matrices that do commute, such as powers of a fixed matrix. 
Here is an example in Aff(1). 


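4.6.1 Work out $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^2$ and $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^3$, and then prove by induction that

$$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^n = \begin{pmatrix} a^n & b\,\frac{a^n - 1}{a - 1} \\ 0 & 1 \end{pmatrix}.$$


4.6.2 Use the formula in Exercise 4.6.1 to work out the nth power of the matrix
$e^{\alpha J + \beta K}$ and compare it with the matrix $e^{n\alpha J + n\beta K}$ obtained by
exponentiating nαJ + nβK.


4.6.3 Show that the matrices $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}^n$, for n = 1, 2, 3, ..., lie on a line in $\mathbb{R}^4$.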


Also show that the line passes through the point ie ‘es


4.7 Discussion


The first to extend the exponential function to noncommuting objects was 
Hamilton, who applied it to quaternions almost as soon as he discovered 
them in 1843. In the paper Hamilton [1967], a writeup of an address to the 
Royal Irish Academy on November 13, 1843, he defines the exponential 
function for a quaternion q on p. 207,

$$e^q = 1 + \frac{q}{1!} + \frac{q^2}{2!} + \frac{q^3}{3!} + \cdots,$$

and observes immediately that

$$e^{q'} e^{q} = e^{q'+q} \quad\text{when } qq' = q'q.$$

On p. 225 he evaluates the exponential of a pure imaginary quaternion,
stating essentially the result of Section 4.2, that

$$e^{u\theta} = \cos\theta + u\sin\theta \quad\text{when } |u| = 1.$$

The exponential map was extended to Lie groups in general by Lie 
in 1888. From his point of view, exponentiation sends “infinitesimal” el- 
ements of a continuous group to “finite” elements (see Hawkins [2000], 
p. 82). A few mathematicians in the late nineteenth century briefly noted 
that exponentiation makes sense for matrices, but the theory of matrix ex- 
ponentiation did not flourish until Wedderburn [1925] proved the submulti- 
plicative property of the matrix absolute value that guarantees convergence 
of the exponential series for matrices. The trailblazing investigation of von 
Neumann [1929] takes Wedderburn’s result as its starting point. 

The matrix exponential function has many properties in common with
the ordinary exponential, such as

$$e^X = \lim_{n\to\infty} \left(1 + \frac{X}{n}\right)^n.$$

We do not need this property in this book, but it nicely illustrates the idea
of Lie (and, before him, Jordan [1869]), that the “finite” elements of a
continuous group may be “generated” by its “infinitesimal” elements. If X
is a tangent vector at 1 to a group G and n is “infinitely large,” then $1 + \frac{X}{n}$
is an “infinitesimal” element of G. By iterating this element n times we
obtain the “finite” element $e^X$ of G.

It was discovered by Lie’s colleague Engel in 1890 that, in the group
SL(2,C) of 2 × 2 complex matrices with determinant 1, not every element
is an exponential. In particular, the matrix $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ is not the
exponential of any matrix tangent to SL(2,C) at 1; hence it is not “generated by
an infinitesimal element” of SL(2,C). (We indicate a proof in the exer- 
cises to Section 5.6.) The result was considered paradoxical at the time 
(see Hawkins [2000], p. 86), and its mystery was dispelled only when the 
global properties of Lie groups became better understood. In the 1920s it 
was realized that the topology of a Lie group is the key to its global behav- 
ior. For example, the paradoxical behavior of SL(2,C) can be attributed 
to its noncompactness, because it can be shown that every element of a 
connected, compact Lie group is the exponential of a tangent vector. We 
do not prove this theorem about exponentiation in this book, but we will 
discuss compactness and connectedness further in Chapter 8. 

For a noncompact, but connected, group G the next best thing to
surjectivity of exp is the following: every $g \in G$ is the product $e^{X_1} e^{X_2} \cdots e^{X_k}$ of
exponentials of finitely many tangent vectors $X_1, X_2, \ldots, X_k$. This result is
due to von Neumann [1929], and we give a proof in Section 8.6.

For readers acquainted with differential geometry, it should be
mentioned that the exponential function can be generalized even beyond matrix
groups, to Riemannian manifolds. In this setting, the exponential function
maps the tangent space $T_P(M)$ at point P on a Riemannian manifold M
into M by mapping lines through O in $T_P(M)$ isometrically onto geodesics
of M through P. The Riemannian manifolds $S^1 = \{z \in \mathbb{C} : |z| = 1\}$ and
$S^3 = \{q \in \mathbb{H} : |q| = 1\}$, and their tangent spaces $\mathbb{R}$ and $\mathbb{R}^3$, nicely illustrate
the geodesic aspect of exponentiation. The exponential map sends straight
lines through O in the tangent space isometrically to geodesic circles in
the manifolds (to $S^1$ itself in $\mathbb{C}$, and to the unit circles $\cos\theta + u\sin\theta$ in $\mathbb{H}$,
which are geodesic because they are the largest possible circles in $S^3$).


5 


The tangent space 


PREVIEW 


The miracle of Lie theory is that a curved object, a Lie group G, can be
almost completely captured by a flat one, the tangent space $T_1(G)$ of G at
the identity. The tangent space of G at the identity consists of the tangent
vectors to smooth paths in G where they pass through 1. A path A(t) in G
is called smooth if its derivative A’(t) exists, and if A(0) = 1 we call A’(0)
the tangent or velocity vector of A(t) at 1. $T_1(G)$ consists of the velocity
vectors of all smooth paths through 1.

It is quite easy to determine the form of the matrix A’(0) for a smooth 
path A(t) through 1 in any of the classical groups, that is, the generalized 
rotation groups of Chapter 3 and the general and special linear groups, 
GL(n,C) and SL(n,C), we will meet in Section 5.6. For example, any 
tangent vector of SO(n) at 1 is an n × n real skew-symmetric matrix, that is, a
matrix X such that $X + X^{\mathsf{T}} = 0$. The problem is to find smooth paths in the
first place. It is here that the exponential function comes to our rescue. 

As we saw in Section 4.5, $e^X$ is defined for any n × n matrix X by the
infinite series used to define $e^x$ for any real or complex number x. This
matrix exponential function provides a smooth path with prescribed tangent
vector at 1, namely the path $A(t) = e^{tX}$, for which A’(0) = X. In particular,
it turns out that if X is skew-symmetric then $e^{tX} \in \mathrm{SO}(n)$ for any real t, so
the potential tangent vectors to SO(n) are the actual tangent vectors.

In this way we find that $T_1(\mathrm{SO}(n)) = \{X \in M_n(\mathbb{R}) : X + X^{\mathsf{T}} = 0\}$, where
$M_n(\mathbb{R})$ is the space of n × n real matrices. The exponential function
similarly enables us to find the tangent spaces of all the classical groups: O(n),
SO(n), U(n), SU(n), Sp(n), GL(n, C), and SL(n, C).


5.1 Tangent vectors of O(n), U(n), Sp(n)


In a space S of matrices, a path is a continuous function $t \mapsto A(t) \in S$, where
t belongs to some interval of real numbers, so the entries $a_{ij}(t)$ of A(t) are
continuous functions of the real variable t. The path is called smooth, or
differentiable, if the functions $a_{ij}(t)$ are differentiable.

For example, the function

$$t \mapsto B(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}$$

is a smooth path in SO(2), while the function

$$t \mapsto C(t) = \begin{pmatrix} \cos|t| & -\sin|t| \\ \sin|t| & \cos|t| \end{pmatrix}$$

is a path in SO(2) that is not smooth at t = 0.
The derivative A’(t) of a smooth path A(t) is defined in the usual way as

$$A'(t) = \lim_{\Delta t \to 0} \frac{A(t + \Delta t) - A(t)}{\Delta t},$$

and one sees immediately that A’(t) is simply the matrix with entries $a'_{ij}(t)$,
where $a_{ij}(t)$ are the entries of A(t). Tangent vectors at 1 of a group G of
matrices are matrices X of the form

$$X = A'(0),$$


where A(t) is a smooth path in G with A(0) = 1 (that is, a path “passing 

through 1 at time 0”). Tangent vectors can thus be viewed as “velocity 

vectors” of points moving smoothly through the point 1, as in Section 4.3. 
For example, in SO(2),

$$A(t) = \begin{pmatrix} \cos\theta t & -\sin\theta t \\ \sin\theta t & \cos\theta t \end{pmatrix}$$

is a smooth path through 1 because A(0) = 1. And since

$$A'(t) = \begin{pmatrix} -\theta\sin\theta t & -\theta\cos\theta t \\ \theta\cos\theta t & -\theta\sin\theta t \end{pmatrix},$$

the corresponding tangent vector is

$$A'(0) = \begin{pmatrix} 0 & -\theta \\ \theta & 0 \end{pmatrix}.$$

In fact, all tangent vectors are of this form, so they form the 1-dimensional
vector space of real multiples of the matrix $i = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. This confirms what
we already know geometrically: SO(2) is a circle and its tangent space at
the identity is a line.
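
The SO(2) example can be checked numerically by estimating the velocity of the path at t = 0 with a finite difference. This is only a sketch: the function names, the step size h, and the sample value of θ are assumptions made for illustration.

```python
import math

def A(t, theta):
    """The path A(t) in SO(2): rotation through the angle theta * t."""
    c, s = math.cos(theta * t), math.sin(theta * t)
    return [[c, -s], [s, c]]

def velocity_at_identity(path, h=1e-6):
    """Central-difference estimate of path'(0), computed entry by entry."""
    plus, minus = path(h), path(-h)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

theta = 0.75
X = velocity_at_identity(lambda t: A(t, theta))
print(X)   # approximately [[0, -theta], [theta, 0]]
```

The estimate is a real multiple of the matrix with rows (0, −1) and (1, 0), in line with the one-dimensional tangent space found above.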
We now find the form of tangent vectors for all the groups O(n), U(n),
Sp(n) by differentiating the defining equation $A\bar{A}^{\mathsf{T}} = 1$ of their members
A. (In the case of O(n), A is real, so $\bar{A} = A$. In the cases of U(n) and Sp(n),
$\bar{A}$ is the complex and quaternion conjugate, respectively.)


Tangent vectors of O(n), U(n), Sp(n). The tangent vectors X at 1 are
matrices of the following forms (where 0 denotes the zero matrix):

(a) For O(n), n × n real matrices X such that $X + X^{\mathsf{T}} = 0$.

(b) For U(n), n × n complex matrices X such that $X + \bar{X}^{\mathsf{T}} = 0$.

(c) For Sp(n), n × n quaternion matrices X such that $X + \bar{X}^{\mathsf{T}} = 0$.


Proof. (a) The matrices $A \in \mathrm{O}(n)$ satisfy $AA^{\mathsf{T}} = 1$. Let A = A(t) be a
smooth path originating at 1, and take d/dt of the equation

$$A(t)A(t)^{\mathsf{T}} = 1.$$

The product rule holds as for ordinary functions, as does $\frac{d}{dt}1 = 0$ because
1 is a constant. Also, $\frac{d}{dt}(A^{\mathsf{T}}) = \left(\frac{d}{dt}A\right)^{\mathsf{T}}$ by considering matrix entries. So
we have

$$A'(t)A(t)^{\mathsf{T}} + A(t)A'(t)^{\mathsf{T}} = 0.$$

Since $A(0) = 1 = A(0)^{\mathsf{T}}$, for t = 0 this equation becomes

$$A'(0) + A'(0)^{\mathsf{T}} = 0.$$

Thus any tangent vector X = A’(0) satisfies $X + X^{\mathsf{T}} = 0$.


(b) The matrices $A \in \mathrm{U}(n)$ satisfy $A\bar{A}^{\mathsf{T}} = 1$. Again let A = A(t) be a
smooth path with A(0) = 1 and now take d/dt of the equation $A\bar{A}^{\mathsf{T}} = 1$. By
considering matrix entries we see that $\frac{d}{dt}\overline{A(t)} = \overline{A'(t)}$. Then an argument
like that in (a) shows that any tangent vector X satisfies $X + \bar{X}^{\mathsf{T}} = 0$.


(c) For the matrices $A \in \mathrm{Sp}(n)$ we similarly find that the tangent vectors
X satisfy $X + \bar{X}^{\mathsf{T}} = 0$.

The matrices X satisfying $X + X^{\mathsf{T}} = 0$ are called skew-symmetric,
because the reflection of each entry in the diagonal is its negative. That is,
$x_{ji} = -x_{ij}$. In particular, all the diagonal elements of a skew-symmetric
matrix are 0. Matrices X satisfying $X + \bar{X}^{\mathsf{T}} = 0$ are called skew-Hermitian.
Their entries satisfy $x_{ji} = -\bar{x}_{ij}$ and their diagonal elements are pure
imaginary.

It turns out that all skew-symmetric n x n real matrices are tangent, 
not only to O(n), but also to SO(n) at 1. To prove this we use the matrix 
exponential function from Section 4.5, showing that $e^X \in \mathrm{SO}(n)$ for any
skew-symmetric X, in which case X is tangent to the smooth path $e^{tX}$ in
SO(n).


Exercises 
To appreciate why smooth paths are better than mere paths, consider the following 
example. 


5.1.1 Interpret the paths B(t) and C(t) above as paths on the unit circle, say for
−π/2 < t < π/2.


5.1.2 If B(t) or C(t) is interpreted as the position of a point at time t, how does
the motion described by B(t) differ from the motion described by C(t)?


5.2 The tangent space of SO(n) 


In this section we return to the addition formula of the exponential function

$$e^{A+B} = e^A e^B \quad\text{when } AB = BA,$$

which was previously set as a series of exercises in Section 4.1. This
formula can be proved by observing the nature of the calculation involved,
without actually doing any calculation. The argument goes as follows.

According to the definition of the exponential function, we want to
prove that

$$1 + \frac{A+B}{1!} + \cdots + \frac{(A+B)^n}{n!} + \cdots
= \left(1 + \frac{A}{1!} + \cdots + \frac{A^n}{n!} + \cdots\right)\left(1 + \frac{B}{1!} + \cdots + \frac{B^n}{n!} + \cdots\right).$$

This could be done by expanding both sides and showing that the
coefficient of $A^l B^m$ is the same on both sides. But if AB = BA the calculation
involved is the same as the calculation for real numbers A and B, in which
case we know that $e^{A+B} = e^A e^B$ by elementary calculus. Therefore, the
formula is correct for any commuting variables A and B.
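
The role of the hypothesis AB = BA can be seen in a small numerical experiment. The sketch below truncates the exponential series after 40 terms (an arbitrary cutoff) and compares the two sides of the addition formula for a commuting pair and for a non-commuting pair. The helper names and the sample matrices are assumptions made for illustration. The non-commuting pair consists of the matrices J and K from the Aff(1) example of Section 4.6.

```python
def matmul(M, N):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(M, N):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[M[i][j] + N[i][j] for j in range(2)] for i in range(2)]

def expm(X, terms=40):
    """Partial sum of the exponential series 1 + X/1! + X^2/2! + ..."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in matmul(term, X)]  # X^m / m!
        total = add(total, term)
    return total

def gap(M, N):
    """Largest entrywise difference between two matrices."""
    return max(abs(M[i][j] - N[i][j]) for i in range(2) for j in range(2))

# Commuting pair: A and B = 2A.
A = [[0.2, -0.5], [0.4, 0.1]]
B = [[0.4, -1.0], [0.8, 0.2]]
commuting_gap = gap(expm(add(A, B)), matmul(expm(A), expm(B)))

# Non-commuting pair: J and K satisfy JK = K but KJ = 0.
J = [[1.0, 0.0], [0.0, 0.0]]
K = [[0.0, 1.0], [0.0, 0.0]]
noncommuting_gap = gap(expm(add(J, K)), matmul(expm(J), expm(K)))

print(commuting_gap)      # close to 0
print(noncommuting_gap)   # close to 1
```

For J and K the two sides differ by 1 in the upper right entry, so the addition formula genuinely fails without the commuting hypothesis.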

Now, the beauty of the matrices X and $X^{\mathsf{T}}$ appearing in the condition
$X + X^{\mathsf{T}} = 0$ is that they commute! This is because, under this condition,

$$XX^{\mathsf{T}} = X(-X) = (-X)X = X^{\mathsf{T}}X.$$

Thus it follows from the above property of the exponential function that

$$e^X e^{X^{\mathsf{T}}} = e^{X + X^{\mathsf{T}}} = e^0 = 1.$$

But also, $e^{X^{\mathsf{T}}} = (e^X)^{\mathsf{T}}$ because $(X^{\mathsf{T}})^m = (X^m)^{\mathsf{T}}$ and hence all terms in the
exponential series get transposed. Therefore

$$1 = e^X e^{X^{\mathsf{T}}} = e^X (e^X)^{\mathsf{T}}.$$

In other words, if $X + X^{\mathsf{T}} = 0$ then $e^X$ is an orthogonal matrix.

Moreover, $e^X$ has determinant 1, as can be seen by considering the path
of matrices tX for 0 ≤ t ≤ 1. For t = 0, we have tX = 0, so

$$e^{tX} = e^0 = 1, \quad\text{which has determinant 1.}$$

And, as t varies from 0 to 1, $e^{tX}$ varies continuously from 1 to $e^X$. This
implies that the continuous function $\det(e^{tX})$ remains constant, because
det = ±1 for orthogonal matrices, and a continuous function cannot take
two (and only two) values. Thus we necessarily have $\det(e^X) = 1$, and
therefore if X is an n × n real matrix with $X + X^{\mathsf{T}} = 0$ then $e^X \in \mathrm{SO}(n)$.
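
As a numerical illustration of this argument, the sketch below exponentiates one 3 × 3 skew-symmetric matrix by summing the series and then checks orthogonality and the determinant. The helper names, the 40-term cutoff, and the sample entries of X are assumptions made for the example; it is a check on one matrix and not a proof.

```python
def matmul(M, N):
    """Product of two square matrices given as nested lists."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    """Transpose of a matrix given as nested lists."""
    return [list(row) for row in zip(*M)]

def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(X, terms=40):
    """Partial sum of the exponential series 1 + X/1! + X^2/2! + ..."""
    n = len(X)
    total, term = identity(n), identity(n)
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in matmul(term, X)]  # X^m / m!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# A sample skew-symmetric matrix: X + X^T = 0.
X = [[0.0, -0.3, -1.1],
     [0.3, 0.0, -0.7],
     [1.1, 0.7, 0.0]]

E = expm(X)
product = matmul(E, transpose(E))   # should be close to the identity
print(product)
print(det3(E))                      # should be close to 1
```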

This allows us to complete our search for all the tangent vectors to 
SO(n) at 1. 


Tangent space of SO(n). The tangent space of SO(n) consists of precisely
the n × n real vectors X such that $X + X^{\mathsf{T}} = 0$.


Proof. In the previous section we showed that all tangent vectors X to
SO(n) at 1 satisfy $X + X^{\mathsf{T}} = 0$. Conversely, we have just seen that, for any
vector X with $X + X^{\mathsf{T}} = 0$, the matrix $e^X$ is in SO(n).

Now notice that X is the tangent vector at 1 for the path $A(t) = e^{tX}$ in
SO(n). This holds because

$$\frac{d}{dt} e^{tX} = X e^{tX},$$

as in ordinary calculus. (This can be checked by differentiating the series
for $e^{tX}$.) It follows that A(t) has the tangent vector A’(0) = X at 1, and
therefore each X such that $X + X^{\mathsf{T}} = 0$ occurs as a tangent vector to SO(n)
at 1, as required.


As mentioned in the previous section, a matrix X such that $X + X^{\mathsf{T}} = 0$
is called skew-symmetric. Important examples are the 3 × 3
skew-symmetric matrices, which have the form

$$X = \begin{pmatrix} 0 & -x & -y \\ x & 0 & -z \\ y & z & 0 \end{pmatrix}.$$

Notice that sums and scalar multiples of these skew-symmetric matrices
are again skew-symmetric, so the 3 × 3 skew-symmetric matrices form a
vector space. This space has dimension 3, as we would expect, since it is
the tangent space to the 3-dimensional space SO(3). Less obviously, the
skew-symmetric matrices are closed under the Lie bracket operation

$$[X_1, X_2] = X_1 X_2 - X_2 X_1.$$


Later we will see that the tangent space of any Lie group G is a vector space
closed under the Lie bracket, and that the Lie bracket reflects the conjugate
$g_1 g_2 g_1^{-1}$ of $g_2$ by $g_1^{-1} \in G$. This is why the tangent space is so important
in the investigation of Lie groups: it “linearizes” them without obliterating
much of their structure.
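
Closure under the Lie bracket is easy to test on an example. The sketch below builds two 3 × 3 skew-symmetric matrices in the form displayed above and checks that their bracket is again skew-symmetric. The helper names `skew` and `bracket` and the sample integer entries are assumptions made for illustration.

```python
def matmul(M, N):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(M[i][k] * N[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def skew(x, y, z):
    """The skew-symmetric matrix with parameters x, y, z, in the form used in the text."""
    return [[0, -x, -y],
            [x, 0, -z],
            [y, z, 0]]

def bracket(P, Q):
    """The Lie bracket [P, Q] = PQ - QP."""
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return [[PQ[i][j] - QP[i][j] for j in range(3)] for i in range(3)]

X1 = skew(1, 2, 3)
X2 = skew(4, 5, 6)
B = bracket(X1, X2)
print(B)   # [[0, 3, -6], [-3, 0, 3], [6, -3, 0]]

# B + B^T = 0, so the bracket is again skew-symmetric.
assert all(B[i][j] + B[j][i] == 0 for i in range(3) for j in range(3))
```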


Exercises 


According to the theorem above, the tangent space of SO(3) consists of 3 × 3 real
matrices X such that $X = -X^{\mathsf{T}}$. The following exercises study this space and the
Lie bracket operation on it.


5.2.1 Explain why each element of the tangent space of SO(3) has the form

$$X = \begin{pmatrix} 0 & -x & -y \\ x & 0 & -z \\ y & z & 0 \end{pmatrix} = xI + yJ + zK,$$

where

$$I = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
J = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad
K = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}.$$


5.2.2 Deduce from Exercise 5.2.1 that the tangent space of SO(3) is a real vector
space of dimension 3.


5.2.3 Check that [I, J] = K, [J, K] = I, and [K, I] = J. (This shows, among other
things, that the 3 × 3 real skew-symmetric matrices are closed under the
Lie bracket operation.)


5.2.4 Deduce from Exercises 5.2.2 and 5.2.3 that the tangent space of SO(3)
under the Lie bracket is isomorphic to $\mathbb{R}^3$ under the cross product operation.


5.2.5 Prove directly that the n × n skew-symmetric matrices are closed under the
Lie bracket, using $X^{\mathsf{T}} = -X$ and $Y^{\mathsf{T}} = -Y$.


The argument above shows that exponentiation sends each skew-symmetric
X to an orthogonal $e^X$, but it is not clear that each orthogonal matrix is obtainable
in this way. Here is an argument for the case n = 3.


5.2.6 Find the exponential of the matrix $B = \begin{pmatrix} 0 & -\theta & 0 \\ \theta & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.


5.2.7 Show that $Ae^BA^{\mathsf{T}} = e^{ABA^{\mathsf{T}}}$ for any orthogonal matrix A.


5.2.8 Deduce from Exercises 5.2.6 and 5.2.7 that each matrix in SO(3) equals $e^X$
for some skew-symmetric X.

5.3 The tangent space of U(n), SU(n), Sp(n)


We know from Sections 3.3 and 3.4 that U(n) and Sp(n), respectively, are
the groups of n × n complex and quaternion matrices A satisfying $A\bar{A}^{\mathsf{T}} = 1$.
This equation enables us to find their tangent spaces by essentially the same
steps we used to find the tangent space of SO(n) in the last two sections.
The outcome is also the same, except that, instead of skew-symmetric
matrices, we get skew-Hermitian matrices. As we saw in Section 5.1, these
matrices X satisfy $X + \bar{X}^{\mathsf{T}} = 0$.


Tangent space of U(n) and Sp(n). The tangent space of U(n) consists of
all the n × n complex matrices satisfying $X + \bar{X}^{\mathsf{T}} = 0$. The tangent space
of Sp(n) consists of all n × n quaternion matrices X satisfying $X + \bar{X}^{\mathsf{T}} = 0$,
where $\bar{X}$ denotes the quaternion conjugate of X.


Proof. From Section 5.1 we know that the tangent vectors at 1 to a space
of matrices satisfying $A\bar{A}^{\mathsf{T}} = 1$ are matrices X satisfying $X + \bar{X}^{\mathsf{T}} = 0$.

Conversely, suppose that X is any n × n complex (respectively,
quaternion) matrix such that $X + \bar{X}^{\mathsf{T}} = 0$. It follows that

$$\bar{X}^{\mathsf{T}} = -X$$

and therefore

$$X\bar{X}^{\mathsf{T}} = X(-X) = (-X)X = \bar{X}^{\mathsf{T}}X.$$

This implies, by the addition formula for the exponential function for
commuting matrices, that

$$1 = e^0 = e^{X + \bar{X}^{\mathsf{T}}} = e^X e^{\bar{X}^{\mathsf{T}}}.$$

It is also clear from the definition of $e^X$ that $e^{\bar{X}^{\mathsf{T}}} = \overline{(e^X)}^{\mathsf{T}}$. So if X is any
n × n complex (respectively, quaternion) matrix satisfying $X + \bar{X}^{\mathsf{T}} = 0$ then
$e^X$ is in U(n) (respectively, Sp(n)). It follows in turn that any such X is a
tangent vector at 1. Namely, X = A’(0) for the smooth path $A(t) = e^{tX}$.


In Section 5.1 we found the form of tangent vectors to O(n) at 1, but 
in Section 5.2 we were able to show that all vectors of this form are in 
fact tangent to SO(n), so we actually had the tangent space to SO(n) at 1. 
An identical step from U(n) to SU(n) is not possible, because the tangent 
space of U(n) at 1 is really a larger space than the tangent space to SU(n). 
Vectors X in the tangent space of SU(n) satisfy the additional condition that 
Tr(X), the trace of X, is zero. (Recall the definition from linear algebra: 
the trace of a square matrix is the sum of its diagonal entries.) 

To prove that Tr(X) = 0 for any tangent vector X to SU(n), we use the
following lemma about the determinant and the trace.


Determinant of exp. For any square complex matrix A,

$$\det(e^A) = e^{\mathrm{Tr}(A)}.$$


Proof. We appeal to the theorem from linear algebra that for any complex
matrix A there is an invertible complex³ matrix B and an upper triangular
complex matrix T such that $A = BTB^{-1}$.

The nice thing about putting A in this form is that

$$(BTB^{-1})^m = BTB^{-1}\,BTB^{-1} \cdots BTB^{-1} = BT^mB^{-1}$$

and hence

$$e^A = \sum_{m \ge 0} \frac{A^m}{m!} = B\left(\sum_{m \ge 0} \frac{T^m}{m!}\right)B^{-1} = Be^TB^{-1}.$$

It therefore suffices to prove $\det(e^T) = e^{\mathrm{Tr}(T)}$ for upper triangular T,
because this implies

$$\det(e^A) = \det(Be^TB^{-1}) = \det(e^T) = e^{\mathrm{Tr}(T)} = e^{\mathrm{Tr}(BTB^{-1})} = e^{\mathrm{Tr}(A)}.$$

Here we are appealing to another theorem from linear algebra, which states
that Tr(BC) = Tr(CB) and hence $\mathrm{Tr}(BCB^{-1}) = \mathrm{Tr}(C)$ (exercise).

³The matrix B may be complex even when A is real. We then have an example of a
phenomenon once pointed out by Jacques Hadamard: the shortest path between two real
objects (in this case, $\det(e^A)$ and $e^{\mathrm{Tr}(A)}$) may pass through the complex domain.

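To obtain the value of $\det(e^T)$ for upper triangular T, suppose that

$$T = \begin{pmatrix} t_{11} & * & \cdots & * \\ 0 & t_{22} & \cdots & * \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & t_{nn} \end{pmatrix},$$

where the entries marked * are arbitrary. From this one can see that

• $T^2$ is upper triangular, with ith diagonal entry equal to $t_{ii}^2$,

• $T^m$ is upper triangular, with ith diagonal entry equal to $t_{ii}^m$,

• $e^T$ is upper triangular, with ith diagonal entry equal to $e^{t_{ii}}$,

and hence

$$\det(e^T) = e^{t_{11}} e^{t_{22}} \cdots e^{t_{nn}} = e^{t_{11} + t_{22} + \cdots + t_{nn}} = e^{\mathrm{Tr}(T)},$$

as required.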
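
The identity det(e^A) = e^{Tr(A)} can be spot-checked numerically. The sketch below does so for one 2 × 2 complex matrix, using a truncated exponential series (60 terms, an arbitrary cutoff) and Python's built-in complex numbers. The helper names and the sample entries of A are assumptions made for illustration, and one matrix is of course no substitute for the proof.

```python
import cmath

def matmul(M, N):
    """Product of two square matrices given as nested lists."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(X, terms=60):
    """Partial sum of the exponential series 1 + X/1! + X^2/2! + ..."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in matmul(term, X)]  # X^m / m!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1 + 2j, 0.5],
     [-0.3j, -0.4]]

lhs = det2(expm(A))                  # det(e^A)
rhs = cmath.exp(A[0][0] + A[1][1])   # e^{Tr(A)}
print(abs(lhs - rhs))                # close to 0
```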


Tangent space of SU(n). The tangent space of SU(n) consists of all n × n
complex matrices X such that $X + \bar{X}^{\mathsf{T}} = 0$ and Tr(X) = 0.


Proof. Elements of SU(n) are, by definition, matrices $A \in \mathrm{U}(n)$ with
det(A) = 1. We know that the $A \in \mathrm{U}(n)$ are of the form $e^X$ with $X + \bar{X}^{\mathsf{T}} = 0$.
The extra condition det(A) = 1 is therefore equivalent to

$$1 = \det(A) = \det(e^X) = e^{\mathrm{Tr}(X)}$$

by the theorem just proved. It follows that, given any $A \in \mathrm{U}(n)$,

$$A \in \mathrm{SU}(n) \iff \det(A) = 1 \iff e^{\mathrm{Tr}(X)} = 1 \iff \mathrm{Tr}(X) = 0.$$

Thus the tangent space of SU(n) consists of the n × n complex matrices X
such that $X + \bar{X}^{\mathsf{T}} = 0$ and Tr(X) = 0.


Exercises


Another proof of the crucial result $\det(e^A) = e^{\mathrm{Tr}(A)}$ uses less linear algebra but
more calculus. It goes as follows (if you need help with the details, see Tapp
[2005], p. 72 and p. 88).

Suppose B(t) is a smooth path of n × n complex matrices with B(0) = 1, let
$b_{ij}(t)$ denote the entry in row i and column j of B(t), and let $B_{ij}(t)$ denote the
result of omitting row i and column j.


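5.3.1 Show that

$$\det(B(t)) = \sum_{j=1}^{n} (-1)^{j+1} b_{1j}(t) \det(B_{1j}(t)),$$

and hence

$$\left.\frac{d}{dt}\right|_{t=0} \det(B(t)) = \sum_{j=1}^{n} (-1)^{j+1} \left[ b'_{1j}(0) \det(B_{1j}(0)) + b_{1j}(0) \left.\frac{d}{dt}\right|_{t=0} \det(B_{1j}(t)) \right].$$


5.3.2 Deduce from Exercise 5.3.1, and the assumption B(0) = 1, that

$$\left.\frac{d}{dt}\right|_{t=0} \det(B(t)) = b'_{11}(0) + \left.\frac{d}{dt}\right|_{t=0} \det(B_{11}(t)).$$


5.3.3 Deduce from Exercise 5.3.2, and induction, that

$$\left.\frac{d}{dt}\right|_{t=0} \det(B(t)) = b'_{11}(0) + b'_{22}(0) + \cdots + b'_{nn}(0) = \mathrm{Tr}(B'(0)).$$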


We now apply Exercise 5.3.3 to the smooth path $B(t) = e^{tA}$, for which B’(0) = A,
and the smooth real function

$$f(t) = \det(e^{tA}), \quad\text{for which } f(0) = 1.$$

By the definition of derivative,

$$f'(t) = \lim_{h \to 0} \frac{1}{h}\left[\det(e^{(t+h)A}) - \det(e^{tA})\right].$$


5.3.4 Using the property det(MN) = det(M) det(N) and Exercise 5.3.3, show that

$$f'(t) = f(t)\,\mathrm{Tr}(A).$$


5.3.5 Solve the equation for f(t) in Exercise 5.3.4 by setting $f(t) = g(t)e^{t\,\mathrm{Tr}(A)}$
and showing that g’(t) = 0, hence g(t) = 1. (Why?)
Conclude that $\det(e^A) = e^{\mathrm{Tr}(A)}$.


The tangent space of SU(2) should be the same as the space $\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$
shown in Section 4.2 to be mapped onto SU(2) by the exponential function. This
is true, but it requires some checking.


5.3.6 Show that the skew-Hermitian matrices in the tangent space of SU(2) can
be written in the form bi + cj + dk, where $b, c, d \in \mathbb{R}$ and i, j, and k are
matrices with the same multiplication table as the quaternions i, j, and k.


5.3.7 Also find the tangent space of Sp(1) (which should be the same).


Finally, it should be checked that Tr(XY) = Tr(YX), as required in the proof
that $\det(e^A) = e^{\mathrm{Tr}(A)}$. This can be seen almost immediately by meditating on the
sum

$$\begin{aligned}
& x_{11}y_{11} + x_{12}y_{21} + \cdots + x_{1n}y_{n1} \\
{}+{} & x_{21}y_{12} + x_{22}y_{22} + \cdots + x_{2n}y_{n2} \\
& \qquad\vdots \\
{}+{} & x_{n1}y_{1n} + x_{n2}y_{2n} + \cdots + x_{nn}y_{nn}.
\end{aligned}$$


5.3.8 Interpret this sum as both Tr(XY) and Tr(YX).


5.4 Algebraic properties of the tangent space 


If G is any matrix group, we can define its tangent space at the identity,
$T_1(G)$, to be the set of matrices of the form X = A’(0), where A(t) is a
smooth path in G with A(0) = 1.


Vector space properties. $T_1(G)$ is a vector space over R; that is, for any
$X, Y \in T_1(G)$ we have $X + Y \in T_1(G)$ and $rX \in T_1(G)$ for any real r.


Proof. Suppose X = A’(0) and Y = B’(0) for smooth paths $A(t), B(t) \in G$
with A(0) = B(0) = 1, so $X, Y \in T_1(G)$. It follows that C(t) = A(t)B(t) is
also a smooth path in G with C(0) = 1, and hence C’(0) is also a member
of $T_1(G)$.

We now compute C’(0) by the product rule and find

$$C'(0) = \left.\frac{d}{dt}\right|_{t=0} A(t)B(t) = A'(0)B(0) + A(0)B'(0) = X + Y \quad\text{because } A(0) = B(0) = 1.$$

Thus $X, Y \in T_1(G)$ implies $X + Y \in T_1(G)$.

To see why $rX \in T_1(G)$ for any real r, consider the smooth path D(t) =
A(rt). We have D(0) = A(0) = 1, so $D'(0) \in T_1(G)$, and

$$D'(0) = rA'(0) = rX.$$

Hence $X \in T_1(G)$ implies $rX \in T_1(G)$, as claimed.


We see from this proof that the vector sum is to some extent an image 
of the product operation on G. But it is not a very faithful image, because 
the vector sum is commutative and the product on G generally is not. 

We find a product operation on $T_1(G)$ that more faithfully reflects the
product on G by studying the behavior of smooth paths A(s) and B(t) near
1 when s and t vary independently.


Lie bracket property. $T_1(G)$ is closed under the Lie bracket, that is, if
$X, Y \in T_1(G)$ then $[X,Y] \in T_1(G)$, where [X,Y] = XY − YX.

Proof. Suppose A(0) = B(0) = 1, A’(0) = X, B’(0) = Y, so $X, Y \in T_1(G)$.
Now consider the path

$$C_s(t) = A(s)B(t)A(s)^{-1} \quad\text{for some fixed value of } s.$$

Then $C_s(t)$ is smooth and $C_s(0) = 1$, so $C_s'(0) \in T_1(G)$. But also,

$$C_s'(0) = A(s)B'(0)A(s)^{-1} = A(s)YA(s)^{-1}$$

is a smooth function of s, because A(s) is. So we have a whole smooth path
$A(s)YA(s)^{-1}$ in $T_1(G)$, and hence its tangent (velocity vector) at s = 0 is
also in $T_1(G)$. (This is because the tangent is the limit of certain elements
of $T_1(G)$, and $T_1(G)$ is closed under limits.)

This tangent is found by differentiating $D(s) = A(s)YA(s)^{-1}$ with
respect to s at s = 0 and using A(0) = 1. Differentiating $A(s)A(s)^{-1} = 1$ at
s = 0 shows that the derivative of $A(s)^{-1}$ at s = 0 is −A’(0), so

$$D'(0) = A'(0)YA(0)^{-1} + A(0)Y(-A'(0)) = XY - YX = [X,Y],$$

since A’(0) = X and A(0) = 1. Thus $X, Y \in T_1(G)$ implies $[X,Y] \in T_1(G)$,
as claimed.
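
The key step of this proof, that the velocity of $A(s)YA(s)^{-1}$ at s = 0 is XY − YX, can be checked numerically. The sketch below takes $A(s) = e^{sX}$ (so that $A(s)^{-1} = e^{-sX}$) with sample skew-symmetric X and Y, and compares a central-difference estimate of the velocity with the bracket. The helper names, the step size h, the 40-term series cutoff, and the sample matrices are all assumptions made for illustration.

```python
def matmul(M, N):
    """Product of two square matrices given as nested lists."""
    n = len(M)
    return [[sum(M[i][k] * N[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, M):
    """Scalar multiple c * M."""
    return [[c * entry for entry in row] for row in M]

def expm(X, terms=40):
    """Partial sum of the exponential series 1 + X/1! + X^2/2! + ..."""
    n = len(X)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in matmul(term, X)]  # X^m / m!
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total

def bracket(P, Q):
    """The Lie bracket [P, Q] = PQ - QP."""
    PQ, QP = matmul(P, Q), matmul(Q, P)
    n = len(P)
    return [[PQ[i][j] - QP[i][j] for j in range(n)] for i in range(n)]

# Sample tangent vectors of SO(3): both matrices are skew-symmetric.
X = [[0.0, -1.0, 0.5], [1.0, 0.0, -2.0], [-0.5, 2.0, 0.0]]
Y = [[0.0, 0.3, -1.0], [-0.3, 0.0, 0.7], [1.0, -0.7, 0.0]]

def D(s):
    """The path D(s) = A(s) Y A(s)^{-1} with A(s) = e^{sX} and A(s)^{-1} = e^{-sX}."""
    return matmul(matmul(expm(scale(s, X)), Y), expm(scale(-s, X)))

h = 1e-5
plus, minus = D(h), D(-h)
velocity = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
lie = bracket(X, Y)
error = max(abs(velocity[i][j] - lie[i][j]) for i in range(3) for j in range(3))
print(error)   # small: the velocity of D at s = 0 matches XY - YX
```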


The tangent space of G, together with its vector space structure and 
Lie bracket operation, is called the Lie algebra of G, and from now on we 
denote it by $\mathfrak{g}$ (the corresponding lower case Fraktur letter).

Definition. A matrix Lie algebra is a vector space of matrices that is closed
under the Lie bracket [X,Y] = XY − YX.


All the Lie algebras we have seen so far have been matrix Lie algebras, 
and in fact there is a theorem (Ado’s theorem) saying that every Lie algebra 
is isomorphic to a matrix Lie algebra. Thus it is not wrong to say simply 
“Lie algebra” rather than “matrix Lie algebra,” and we will usually do so. 

Perhaps the most important idea in Lie theory is to study Lie groups 
by looking at their Lie algebras. This idea succeeds because vector spaces 
are generally easier to work with than curved objects—which Lie groups 
usually are—and the Lie bracket captures most of the group structure. 

However, it should be emphasized at the outset that $\mathfrak{g}$ does not always
capture G entirely, because different Lie groups can have the same Lie
algebra. We have already seen one class of examples. For all n, O(n) is
different from SO(n), but they have the same tangent space at 1 and hence
the same Lie algebra. There is a simple geometric reason for this: SO(n) 
is the subgroup of O(n) whose members are connected by paths to 1. The 
tangent space to O(n) at 1 is therefore the tangent space to SO(n) at 1. 


Exercises 


If, instead of considering the path $C_s(t) = A(s)B(t)A(s)^{-1}$ in G we consider the
path

$$D_s(t) = A(s)B(t)A(s)^{-1}B(t)^{-1} \quad\text{for some fixed value of } s,$$

then we can relate the Lie bracket [X,Y] of $X, Y \in T_1(G)$ to the so-called
commutator $A(s)B(t)A(s)^{-1}B(t)^{-1}$ of smooth paths A(s) and B(t) through 1 in G.


5.4.1 Find $D_s'(t)$, and hence show that $D_s'(0) = A(s)YA(s)^{-1} - Y$.


5.4.2 $D_s'(0) \in T_1(G)$ (why?) and hence, as s varies, we have a smooth path
$E(s) = D_s'(0)$ in $T_1(G)$ (why?).


5.4.3 Show that the velocity E’(0) equals XY − YX, and explain why E’(0) is in
$T_1(G)$.


The tangent space at 1 is the most natural one to consider, but in fact all 
elements of G have the “same” tangent space. 


5.4.4 Show that the smooth paths through any g € G are of the form gA(t), where 
A(t) is a smooth path through 1. 


5.4.5 Deduce from Exercise 5.4.4 that the space of tangents to G at g is
isomorphic to the space of tangents to G at 1.


5.5 Dimension of Lie algebras


Since the tangent space of a Lie group is a vector space over R, it has a
well-defined dimension over R. We can easily compute the dimension of
$\mathfrak{so}(n)$, $\mathfrak{u}(n)$, $\mathfrak{su}(n)$, and $\mathfrak{sp}(n)$ by counting the number of independent real
parameters in the corresponding matrices.


Dimension of $\mathfrak{so}(n)$, $\mathfrak{u}(n)$, $\mathfrak{su}(n)$, and $\mathfrak{sp}(n)$. As vector spaces over R,

(a) $\mathfrak{so}(n)$ has dimension n(n − 1)/2.

(b) $\mathfrak{u}(n)$ has dimension $n^2$.

(c) $\mathfrak{su}(n)$ has dimension $n^2 - 1$.

(d) $\mathfrak{sp}(n)$ has dimension n(2n + 1).


Proof. (a) We know from Section 5.2 that $\mathfrak{so}(n)$ consists of all n × n real
skew-symmetric matrices X. Thus the diagonal entries are zero, and the
entries below the diagonal are the negatives of those above. It follows that
the dimension of $\mathfrak{so}(n)$ is the number of entries above the diagonal, namely

$$1 + 2 + \cdots + (n-1) = \frac{n(n-1)}{2}.$$

(b) We know from Section 5.3 that $\mathfrak{u}(n)$ consists of all n × n complex
skew-Hermitian matrices X. Thus X has n(n − 1)/2 complex entries above
the diagonal and n pure imaginary entries on the diagonal, so the number
of independent real parameters in X is

$$n(n-1) + n = n^2.$$

(c) We know from Section 5.3 that $\mathfrak{su}(n)$ consists of all n × n complex
skew-Hermitian matrices with Tr(X) = 0. Without the Tr(X) = 0
condition, there are $n^2$ real parameters, as we have just seen in (b). The condition
Tr(X) = 0 says that the nth diagonal entry is the negative of the sum of the
remaining diagonal entries, so the number of independent real parameters
is $n^2 - 1$.

(d) We know from Section 5.3 that $\mathfrak{sp}(n)$ consists of all n × n
quaternion skew-Hermitian matrices X. Thus X has n(n − 1)/2 quaternion entries
above the diagonal and n pure imaginary quaternion entries on the
diagonal, so the number of independent real parameters is

$$2n(n-1) + 3n = n(2n - 2 + 3) = n(2n+1).$$

It seems geometrically natural that a matrix group G should have the
same dimension as its tangent space $T_1(G)$ at the identity, but to put this
result on a firm basis we need to construct a bijection between a
neighborhood of 1 in G and a neighborhood of 0 in $T_1(G)$, continuous in both
directions (a homeomorphism). This can be achieved by a deeper study of
the exponential function, which we carry out in Chapter 7 (for other
purposes). But then one faces the even more difficult problem of proving the
invariance of dimension under homeomorphisms. Fortunately, Lie theory
has another way out, which is simply to define the dimension of a Lie group
to be the dimension of its Lie algebra.


Exercises 


The extra dimension that U(n) has over SU(n) is reflected in the fact that the
quotient group U(n)/SU(n) exists and is isomorphic to the circle group $S^1$.
Among other things, this shows that U(n) is not a simple group. Here is how to
show that the quotient exists.


5.5.1 Consider the determinant map det : U(n) → C. Why is this a
homomorphism? What is its kernel?


5.5.2 Deduce from Exercise 5.5.1 that SU(n) is a normal subgroup of U(n).


Since the dimension of U(n) is 1 greater than the dimension of SU(n), we
expect the dimension of U(n)/SU(n) to be 1. The elements of U(n)/SU(n)
correspond to the values of det(A), for matrices A ∈ U(n), by the homomorphism
theorem of Section 2.2. So these values should form a 1-dimensional group,
isomorphic to either R or $S^1$. Indeed, they are points on the unit circle in C,
as the following exercises show.


5.5.3 If A is an n × n complex matrix such that $A\overline{A}^T = 1$, show that |det(A)| = 1.


5.5.4 Give an example of a diagonal unitary matrix A, with det(A) = $e^{i\theta}$.


5.6 Complexification 


The Lie algebras we have constructed so far have been vector spaces over 
R, even though their elements may be matrices with complex or quaternion 
entries. Each element is an initial velocity vector A’(0) of a smooth path 
A(t), which is a function of the real variable t. It follows that, along with 
each velocity vector A’(0), we have its real multiples rA’(0) for each r ∈ R,
because they are the initial velocity vectors of the paths A(rt). Thus the
elements A’(0) of the Lie algebra admit multiplication by all real numbers
but not necessarily by all complex numbers. One can easily give examples
(Exercise 5.6.1) in which a complex matrix A is in a certain Lie algebra but 
iA is not. 

However, it is certainly possible for a Lie algebra to be a vector space 
over C. Indeed, any real matrix Lie algebra g over R has a complexification 


\[ \mathfrak{g} + i\mathfrak{g} = \{A + iB : A, B \in \mathfrak{g}\} \]


that is a vector space over C. It is clear that g + ig is closed under sums,
because g is, and it is closed under multiples by complex numbers because


\[ (a + ib)(A + iB) = aA - bB + i(bA + aB) \]


and aA − bB, bA + aB ∈ g for any real numbers a and b.
Also, g + ig is closed under the Lie bracket because


\[ [A_1 + iB_1, A_2 + iB_2] = [A_1, A_2] - [B_1, B_2] + i([B_1, A_2] + [A_1, B_2]) \]


by bilinearity, and $[A_1,A_2], [B_1,B_2], [B_1,A_2], [A_1,B_2] \in \mathfrak{g}$ by the closure of g
under the Lie bracket. Thus g + ig is a Lie algebra.
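The bracket identity used above is easy to test numerically. The sketch below takes g = u(2) as an assumed example and checks both the identity and the fact that its real and imaginary parts stay in u(2). The helper names (`skew`, `bracket`, and so on) are illustrative choices, not notation from the text.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    return mat_add(mat_mul(A, B), mat_scale(-1, mat_mul(B, A)))


def conj_transpose(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def is_skew_hermitian(A, tol=1e-12):
    S = mat_add(A, conj_transpose(A))
    return all(abs(entry) < tol for row in S for entry in row)


def skew(a, b, z):
    """A 2 x 2 skew-Hermitian matrix: pure imaginary diagonal, off-diagonal z and -conj(z)."""
    return [[a * 1j, z], [-z.conjugate(), b * 1j]]


# Four sample elements of u(2), chosen arbitrarily.
A1, B1 = skew(1, -3, 2 + 1j), skew(0, 2, -1 + 4j)
A2, B2 = skew(5, 1, 3 - 2j), skew(-2, 7, 1j)

lhs = bracket(mat_add(A1, mat_scale(1j, B1)), mat_add(A2, mat_scale(1j, B2)))
real_part = mat_add(bracket(A1, A2), mat_scale(-1, bracket(B1, B2)))
imag_part = mat_add(bracket(B1, A2), bracket(A1, B2))
rhs = mat_add(real_part, mat_scale(1j, imag_part))

identity_holds = all(abs(l - r) < 1e-12 for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
parts_in_u = is_skew_hermitian(real_part) and is_skew_hermitian(imag_part)
print(identity_holds, parts_in_u)
```

The same helpers also show why u(2) itself is not a complex vector space: `is_skew_hermitian(mat_scale(1j, A1))` is false, since multiplying a nonzero skew-Hermitian matrix by i gives a Hermitian one.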

Complexifying the Lie algebras u(n) and su(n), which are not vector
spaces over C, gives Lie algebras that happen to be tangent spaces—of the 
general linear group GL(n,C) and the special linear group SL(n,C). 


GL(n,C) and its Lie algebra gl(n,C)


The group GL(n,C) consists of all n × n invertible complex matrices A. It
is clear that the initial velocity A’(0) of any smooth path A(t) in GL(n,C) is
itself an n × n complex matrix. Thus the tangent space gl(n,C) of GL(n,C)
is contained in the space $M_n(\mathbb{C})$ of all n × n complex matrices.

In fact, gl(n,C) = $M_n(\mathbb{C})$. We first observe that exp maps $M_n(\mathbb{C})$ into
GL(n,C) because, for any X ∈ $M_n(\mathbb{C})$ we have


• $e^X$ is an n × n complex matrix.


• $e^X$ is invertible, because it has $e^{-X}$ as its inverse.


It follows, since tX ∈ $M_n(\mathbb{C})$ for any X ∈ $M_n(\mathbb{C})$ and any real t, that $e^{tX}$ is
a smooth path in GL(n,C). Then X is the tangent vector to this path at 1,
and hence the tangent space gl(n,C) equals $M_n(\mathbb{C})$, as claimed.
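The two facts used in this argument can be observed numerically: the exponential of X has the exponential of −X as its inverse, and the path of exponentials of tX has velocity X at t = 0. The sketch below assumes a truncated power series for the exponential and an arbitrary 2 × 2 complex matrix. The helper names, the number of series terms, and the step size h are all illustrative choices.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(X, terms=40):
    """Truncated exponential series 1 + X + X^2/2! + ... (adequate for small matrices)."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, X))
        result = mat_add(result, term)
    return result


def max_abs_diff(A, B):
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


X = [[1 + 2j, 3], [-1j, 0.5]]  # an arbitrary member of M_2(C)

# exp(X) is invertible, with inverse exp(-X).
inverse_error = max_abs_diff(mat_mul(mat_exp(X), mat_exp(mat_scale(-1, X))), identity(2))

# The path t -> exp(tX) passes through 1 at t = 0 with velocity X (forward difference).
h = 1e-6
velocity = mat_scale(1 / h, mat_add(mat_exp(mat_scale(h, X)), mat_scale(-1, identity(2))))
velocity_error = max_abs_diff(velocity, X)

print(inverse_error, velocity_error)
```

Both errors should be small: the first is limited only by rounding, the second by the size of the step h.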
Now we show why gl(n,C) is the complexification of u(n):

\[ \mathfrak{gl}(n,\mathbb{C}) = M_n(\mathbb{C}) = \mathfrak{u}(n) + i\mathfrak{u}(n). \]


It is clear that any member of u(n) + iu(n) is in $M_n(\mathbb{C})$. So it remains to
show that any X ∈ $M_n(\mathbb{C})$ can be written in the form


\[ X = X_1 + iX_2 \quad\text{where } X_1, X_2 \in \mathfrak{u}(n), \tag{*} \]


that is, where $X_1$ and $X_2$ are skew-Hermitian. There is a surprisingly simple
way to do this:

\[ X = \frac{X - \overline{X}^T}{2} + i\,\frac{X + \overline{X}^T}{2i}. \]

We leave it as an exercise to check that $X_1 = \frac{X - \overline{X}^T}{2}$ and $X_2 = \frac{X + \overline{X}^T}{2i}$ satisfy

\[ X_1 + \overline{X_1}^T = 0 = X_2 + \overline{X_2}^T, \]

which completes the proof.

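As a matter of fact, for each X ∈ gl(n,C) the equation (*) has a unique
solution with $X_1, X_2 \in \mathfrak{u}(n)$. One solves (*) by first taking the conjugate
transpose of both sides, then forming

\[
\begin{aligned}
X + \overline{X}^T &= X_1 + \overline{X_1}^T + i(X_2 - \overline{X_2}^T) \\
&= i(X_2 - \overline{X_2}^T) && \text{because } X_1 + \overline{X_1}^T = 0 \\
&= 2iX_2 && \text{because } X_2 + \overline{X_2}^T = 0, \\
X - \overline{X}^T &= X_1 - \overline{X_1}^T + i(X_2 + \overline{X_2}^T) \\
&= X_1 - \overline{X_1}^T && \text{because } X_2 + \overline{X_2}^T = 0 \\
&= 2X_1 && \text{because } X_1 + \overline{X_1}^T = 0.
\end{aligned}
\]

Thus $X_1 = \frac{X - \overline{X}^T}{2}$ and $X_2 = \frac{X + \overline{X}^T}{2i}$ are in fact the only values $X_1, X_2 \in \mathfrak{u}(n)$
that satisfy (*).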
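The decomposition into skew-Hermitian parts can be tried on a concrete matrix. The following sketch is a numerical illustration rather than a proof: the function name `decompose` and the sample matrices are assumptions made for the example. It forms the two parts from the formulas above, checks that both are skew-Hermitian and recombine to X, and checks that the parts have trace zero when X does.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def conj_transpose(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def is_skew_hermitian(A, tol=1e-12):
    S = mat_add(A, conj_transpose(A))
    return all(abs(entry) < tol for row in S for entry in row)


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def decompose(X):
    """Return (X1, X2) with X = X1 + i*X2, using the formulas in the text."""
    Xs = conj_transpose(X)
    X1 = mat_scale(0.5, mat_add(X, mat_scale(-1, Xs)))  # (X - X*) / 2
    X2 = mat_scale(-0.5j, mat_add(X, Xs))               # (X + X*) / (2i), since 1/(2i) = -i/2
    return X1, X2


X = [[1 + 2j, 3 - 1j], [4j, -2 + 5j]]  # an arbitrary member of M_2(C)
X1, X2 = decompose(X)
recombined = mat_add(X1, mat_scale(1j, X2))
recombines = all(abs(a - b) < 1e-12 for ra, rb in zip(recombined, X) for a, b in zip(ra, rb))

Z = [[1 + 2j, 3 - 1j], [4j, -1 - 2j]]  # trace zero
Z1, Z2 = decompose(Z)

print(is_skew_hermitian(X1), is_skew_hermitian(X2), recombines)
print(abs(trace(Z1)), abs(trace(Z2)))
```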


SL(n, C) and its Lie algebra sl(n, C) 


The group SL(n,C) is the subgroup of GL(n,C) consisting of the n x n 
complex matrices A with det(A) = 1. The tangent vectors of SL(n,C) are 
among the tangent vectors X of GL(n,C), but they satisfy the additional 
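condition Tr(X) = 0. This is because $e^X$ ∈ GL(n,C) and


\[ \det(e^X) = e^{\mathrm{Tr}(X)} = 1 \iff \mathrm{Tr}(X) = 0. \]

(Strictly, $e^{\mathrm{Tr}(X)} = 1$ alone forces only that Tr(X) is an integer multiple
of 2πi; requiring $\det(e^{tX}) = e^{t\,\mathrm{Tr}(X)} = 1$ for every real t gives Tr(X) = 0.)

Conversely, if X has trace zero, then so has tX for any real t, so a
matrix X with trace zero gives a smooth path $e^{tX}$ in SL(n,C). This path
has tangent X at 1, so


\[ \mathfrak{sl}(n,\mathbb{C}) = \{X \in M_n(\mathbb{C}) : \mathrm{Tr}(X) = 0\}. \]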
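The relation between determinant and trace that drives this argument can be checked numerically for 2 × 2 matrices. In the sketch below the exponential is computed by a truncated power series. The helper names and sample matrices are illustrative assumptions, not taken from the text.

```python
import cmath


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(X, terms=40):
    """Truncated exponential series 1 + X + X^2/2! + ..."""
    result = identity(len(X))
    term = identity(len(X))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, X))
        result = mat_add(result, term)
    return result


def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


# For a general X, det(exp(X)) agrees with exp(Tr(X)).
X_general = [[0.5 + 1j, 2], [3j, -1]]
general_error = abs(det2(mat_exp(X_general)) - cmath.exp(trace(X_general)))

# For a trace-zero X, the whole path exp(tX) stays in SL(2,C).
X_traceless = [[1 + 1j, 2], [3j, -1 - 1j]]
path_errors = [abs(det2(mat_exp(mat_scale(t, X_traceless))) - 1) for t in (0.25, 0.5, 1.0)]

print(general_error, max(path_errors))
```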


We now show that the latter set of matrices is the complexification of
su(n), su(n) + isu(n). Since any X ∈ su(n) has trace zero, any member of
su(n) + isu(n) also has trace zero. Conversely, any X ∈ $M_n(\mathbb{C})$ with trace
zero can be written as


\[ X = X_1 + iX_2, \quad\text{where } X_1, X_2 \in \mathfrak{su}(n). \]

We use the same trick as for u(n) + iu(n); namely, write


\[ X = \frac{X - \overline{X}^T}{2} + i\,\frac{X + \overline{X}^T}{2i}. \]


As before, $X_1 = \frac{X - \overline{X}^T}{2}$ and $X_2 = \frac{X + \overline{X}^T}{2i}$ are skew-Hermitian. But also, $X_1$
and $X_2$ have trace zero, because X has.

Thus, sl(n,C) = {X ∈ $M_n(\mathbb{C})$ : Tr(X) = 0} = su(n) + isu(n), as
claimed.

Also, by an argument like that used above for gl(n,C), each X ∈ sl(n,C)
corresponds to a unique ordered pair $X_1, X_2$ of elements of su(n) such that


\[ X = X_1 + iX_2. \]


This equation therefore gives a 1-to-1 correspondence between the
elements X of sl(n,C) and the ordered pairs $(X_1, X_2)$ such that $X_1, X_2 \in \mathfrak{su}(n)$.


Exercises 


5.6.1 Show that u(n) and su(n) are not vector spaces over C.


5.6.2 Check that $X_1 = \frac{X - \overline{X}^T}{2}$ and $X_2 = \frac{X + \overline{X}^T}{2i}$ are skew-Hermitian, and that $X_1$ and
$X_2$ have trace zero when X has.


5.6.3 Show that the groups GL(n,C) and SL(n,C) are unbounded (noncompact)
when the matrix with (j,k)-entry $(a_{jk} + ib_{jk})$ is identified with the point


\[ (a_{11}, b_{11}, a_{12}, b_{12}, \ldots, a_{1n}, b_{1n}, \ldots, a_{nn}, b_{nn}) \in \mathbb{R}^{2n^2} \]


and distance between matrices is the usual distance between points in $\mathbb{R}^{2n^2}$.
The following exercises show that the matrix $A = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ in SL(2,C) is not
equal to $e^X$ for any X ∈ sl(2,C), the 2 × 2 matrices with trace zero. Thus exp does
not map the tangent space onto the group in this case. The idea is to calculate $e^X$
explicitly with the help of the Cayley-Hamilton theorem, which for 2 × 2 matrices
X says that

\[ X^2 - (\mathrm{Tr}(X))X + \det(X)\mathbf{1} = 0. \]


Therefore, when Tr(X) = 0 we have $X^2 = -\det(X)\mathbf{1}$.

5.6.4 When $X^2 = -\det(X)\mathbf{1}$, show that


\[ e^X = \cos\bigl(\sqrt{\det(X)}\bigr)\mathbf{1} + \frac{\sin\bigl(\sqrt{\det(X)}\bigr)}{\sqrt{\det(X)}}\,X. \]


5.6.5 Using Exercise 5.6.4, and the fact that Tr(X) = 0, show that if


\[ e^X = \begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} \]

then $\cos(\sqrt{\det(X)}) = -1$, in which case $\sin(\sqrt{\det(X)}) = 0$, and there is a
contradiction.


5.6.6 It follows not only that exp does not map sl(2,C) onto SL(2,C) but also 
that exp does not map gl(2,C) onto GL(2,C). Why? 


This is not our first example of a Lie algebra that is not mapped onto its 
group by exp. We have already seen that exp cannot map o(n) onto O(n) because
o(n) is path-connected and O(n) is not. What makes the sl(n,C) and gl(n,C) 
examples so interesting is that SL(n,C) and GL(n,C) are path-connected. We 
gave some results on path-connectedness in Sections 3.2 and 3.8, and will give 
more in Section 8.6, including a proof that GL(n,C) is path-connected. 


5.6.7 Find maximal tori, and hence the centers, of GL(n,C) and SL(n,C).


5.6.8 Assuming path-connectedness, also find their discrete normal subgroups. 


5.7 Quaternion Lie algebras 


Analogous to GL(n,C), there is the group GL(n,H) of all invertible n × n
quaternion matrices. Its tangent vectors lie in the space $M_n(\mathbb{H})$ of all
n × n quaternion matrices, and indeed each X ∈ $M_n(\mathbb{H})$ is a tangent
vector, because the quaternion matrix $e^X$ has the inverse $e^{-X}$ and hence lies
in GL(n,H). So, for each X ∈ $M_n(\mathbb{H})$ we have the smooth path $e^{tX}$ in
GL(n,H) with tangent X.

Thus the Lie algebra gl(n,H) of GL(n,H) is precisely $M_n(\mathbb{H})$.
However, there is no “sl(n,H)” of quaternion matrices of trace zero.
This set of matrices is closed under sums and scalar multiples but, because
of the noncommutative quaternion product, not under the Lie bracket. For
example, we have the following matrices of trace zero in $M_2(\mathbb{H})$:


\[ X = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad Y = \begin{pmatrix} j & 0 \\ 0 & -j \end{pmatrix}. \]


But their Lie bracket is


\[ XY - YX = \begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix} - \begin{pmatrix} -k & 0 \\ 0 & -k \end{pmatrix} = 2\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}, \]

which does not have trace zero.
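The failure of closure can be reproduced with a few lines of quaternion arithmetic. The sketch below assumes the trace-zero matrices X = diag(i, −i) and Y = diag(j, −j), one choice that produces the bracket shown above. Quaternions are represented as (real, i, j, k) tuples, and all function names are made up for the illustration.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))


def qneg(p):
    return tuple(-x for x in p)


ZERO, I, J, K = (0, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def mat_mul(A, B):
    """Product of square quaternion matrices; the order of the factors matters."""
    n = len(A)
    C = [[ZERO] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            total = ZERO
            for m in range(n):
                total = qadd(total, qmul(A[r][m], B[m][c]))
            C[r][c] = total
    return C


def bracket(A, B):
    AB, BA = mat_mul(A, B), mat_mul(B, A)
    return [[qadd(x, qneg(y)) for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]


def trace(A):
    total = ZERO
    for r in range(len(A)):
        total = qadd(total, A[r][r])
    return total


X = [[I, ZERO], [ZERO, qneg(I)]]  # trace zero
Y = [[J, ZERO], [ZERO, qneg(J)]]  # trace zero

print(trace(X), trace(Y), trace(bracket(X, Y)))
```

The last trace printed is 4k, not zero, because ij = k while ji = −k.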


The quaternion Lie algebra that interests us most is sp(n), the tangent 
space of Sp(n). As we found in Section 5.3, 


\[ \mathfrak{sp}(n) = \{X \in M_n(\mathbb{H}) : X + \overline{X}^T = 0\}, \]


where $\overline{X}$ denotes the result of replacing each entry of X by its quaternion
conjugate.

There is no neat relationship between sp(n) and gl(n,H) analogous
to the relationship between su(n) and sl(n,C). This can be seen by
considering dimensions: gl(n,H) has dimension $4n^2$ over R, whereas sp(n)
has dimension $2n^2 + n$, as we saw in Section 5.5. Therefore, we cannot
decompose gl(n,H) into two subspaces that look like sp(n), because the
dimensions do not add up.

As a result, we need to analyze sp(n) from scratch, and it turns out to
be “simpler” than gl(n,H), in a sense we will explain in Section 6.6.


Exercises 
5.7.1 Give three examples of subspaces of gl(n,H) closed under the Lie bracket. 
5.7.2 What are the dimensions of your examples? 


5.7.3 If your examples do not include one of real dimension 1, give such an ex- 
ample. 


5.7.4 Also, if you have not already done so, give an example g of dimension n
that is commutative. That is, [X,Y] = 0 for all X,Y ∈ g.
5.8 Discussion


The classical groups were given their name by Hermann Weyl in his 1939
book The Classical Groups. Weyl did not give a precise enumeration of the
groups he considered “classical,” but it seems plausible from the content of
his book that he meant the general and special linear groups, the orthogonal
groups, and the unitary and symplectic groups. Weyl briefly mentioned
that the concept of orthogonal group can be extended to include the group
O(p,q) of transformations of $\mathbb{R}^{p+q}$ preserving the (not positive definite)
inner product defined by


\[
\begin{aligned}
(u_1, u_2, \ldots, u_p, u'_1, u'_2, \ldots, u'_q) &\cdot (v_1, v_2, \ldots, v_p, v'_1, v'_2, \ldots, v'_q) \\
&= u_1v_1 + u_2v_2 + \cdots + u_pv_p - u'_1v'_1 - u'_2v'_2 - \cdots - u'_qv'_q.
\end{aligned}
\]


An important special case is the Lorentz group O(1,3), which defines 
the geometry of Minkowski space—the “spacetime” of special relativity. 
There are also “p,q generalizations” of the unitary and symplectic groups, 
and today these groups are often considered “classical.” However, in this 
book we apply the term “classical groups” only to the general and special 
linear groups, and O(n), SO(n), U(n), SU(n), and Sp(n). 

Weyl also introduced the term “Lie algebra” (in lectures at Princeton in
1934-35, at the suggestion of Nathan Jacobson) for the collection of what 
Lie had called the “infinitesimal elements of a continuous group.” 

The Lie algebras of the classical groups were implicitly known by Lie. 
However, the description of Lie algebras by matrices was taken up only 
belatedly, alongside the late-dawning realization that linear algebra is a 
fundamental part of mathematics. As we have seen, the serious study of 
matrix Lie groups began with von Neumann [1929], and the first examples 
of nonmatrix Lie groups were not given until 1936. At about the same 
time, I. D. Ado showed that linear algebra really is an adequate basis for 
the theory of Lie algebras, in the sense that any Lie algebra can be viewed 
as a vector space of matrices. 

As late as 1946, Chevalley thought it worthwhile to point out why it is 
convenient to view elements of matrix groups as exponentials of elements 
in their Lie algebras: 


The property of a matrix being orthogonal or unitary is defined
by a system of nonlinear relationships between its coefficients;
the exponential mapping gives a parametric representation of
the set of unitary (or orthogonal) matrices by matrices whose
coefficients satisfy linear relations.


Chevalley [1946] is the first book, as far as I know, to explicitly describe 
the Lie algebras of orthogonal, unitary, and symplectic groups as the spaces 
of skew-symmetric and skew-Hermitian matrices. 

The idea of viewing the Lie algebra as the tangent space of the group 
goes back a little further, though it did not spring into existence fully 
grown. In von Neumann [1929], elements of the Lie algebra of a matrix
group G are taken to be limits of sequences of matrices in G, and von
Neumann’s limits can indeed be viewed as tangents, though this fact is not 
immediately obvious (see Section 7.3). The idea of defining tangent vec- 
tors to G via smooth paths in G seems to originate with Pontrjagin [1939], 
p. 183. The full-blooded definition of Lie groups as smooth manifolds and 
Lie algebras as their tangent spaces appears in Chevalley [1946]. 

In this book I do not wish to operate at the level of generality that 
requires a definition of smooth manifolds. However, a few remarks are 
in order, since the concept of smooth manifold includes some objects that 
do not look “smooth” at first sight. For example, a single point is smooth 
and so is any finite set of points. This has the consequence that {1,—1} 
is a smooth subgroup of SU(2), and also of SO(n) for any even n. The 
reason is that a smooth group should have a tangent space at every point, 
but nobody said the tangent space has to be big! 

“Smoothness” of a k-dimensional group G should imply that G has a
tangent space isomorphic to $\mathbb{R}^k$ at 1 (and hence at any point), but this
includes the possibility that the tangent space is $\mathbb{R}^0 = \{0\}$. We must therefore
accept groups as “smooth” if they have zero tangent space at 1, which is
the case for {1}, {1,−1}, and any other finite group. In fact, finite groups
are included in the definition of “matrix Lie group” stated in Section 1.1, 
since they are closed under nonsingular limits. 

Nevertheless, the presence of nontrivial groups with zero tangent space, 
such as {1,−1}, complicates the search for simple groups. If a group G is
simple, then its tangent space g is a simple Lie algebra, in a sense that will 
be defined in the next chapter. Simple Lie algebras are generally easier to 
recognize than simple Lie groups, so we find the simple Lie algebras g first 
and then see what they tell us about the group G. A good idea—except that 
g cannot “see” the finite subgroups of G, because they have zero tangent 
space. Simplicity of g therefore does not rule out the possibility of finite 
normal subgroups of G, because they are “invisible” to g. This is why we 
took the trouble to find the centers of various groups in Chapter 3. It turns
out, as we will show in Chapter 7, that g can “see” all the normal subgroups 
of G except those that lie in the center, so in finding the centers we have 
already found all the normal subgroups. 

The pioneers of Lie theory, such as Lie himself, were not troubled by 
the subtle difference between simplicity of a Lie group and simplicity of its 
Lie algebra. They viewed Lie groups only locally and took members of the 
Lie algebra to be members of the Lie group anyway (the “infinitesimal” el- 
ements). For the pioneers, the problem was to find the simple Lie algebras. 
Lie himself found almost all of them, as Lie algebras of classical groups. 
But finding the remaining simple Lie algebras—the so-called exceptional 
Lie algebras—was a monumentally difficult problem. Its solution by Wil- 
helm Killing around 1890, with corrections by Elie Cartan in 1894, is now 
viewed as one of the greatest achievements in the history of mathematics. 

Since the 1920s and 1930s, when Lie groups came to be viewed as 
global objects and Lie algebras as their tangent spaces at 1, the question of 
what to say about simple Lie groups has generally been ignored or fudged. 
Some authors avoid saying anything by defining a simple Lie group to be 
one whose Lie algebra is simple, often without pointing out that this con- 
flicts with the standard definition of simple group. Others (such as Bour- 
baki [1972]) define a Lie group to be almost simple if its Lie algebra is 
simple, which is another way to avoid saying anything about the genuinely 
simple Lie groups. 

The first paper to study the global properties of Lie groups was Schreier 
[1925]. This paper was overlooked for several years, but it turned out to 
be extremely prescient. Schreier accurately identified both the general role 
of topology in Lie theory, and the special role of the center of a Lie group. 
Thus there is a long-standing precedent for studying Lie group structure as 
a topological refinement of Lie algebra structure, and we will take up some 
of Schreier’s ideas in Chapters 8 and 9. 


6 


Structure of Lie algebras 


PREVIEW 


In this chapter we return to our original motive for studying Lie algebras: 
to understand the structure of Lie groups. We saw in Chapter 2 how normal 
subgroups help to reveal the structure of the groups SO(3) and SO(4). To 
go further, we need to know exactly how the normal subgroups of a Lie 
group G are reflected in the structure of its Lie algebra g. 

The focus of attention shifts from groups to algebras with the following 
discovery. The tangent map from a Lie group G to its Lie algebra g sends 
normal subgroups of G to substructures of g called ideals. Thus the ideals 
of g “detect” normal subgroups of G in the sense that a nontrivial ideal of 
g implies a nontrivial normal subgroup of G. 

Lie algebras with no nontrivial ideals, like groups with no nontrivial 
normal subgroups, are called simple. It is not quite true that simplicity of 
g implies simplicity of G, but it turns out to be easier to recognize simple 
Lie algebras, so we consider that problem first. 

We prove simplicity for the “generalized rotation” Lie algebras so(n) 
for n > 4, su(n), sp(n), and also for the Lie algebra of the special linear 
group of $\mathbb{C}^n$. The proofs occupy quite a few pages, but they are all
variations on the same elementary argument. It may help to skip the details
(which are only matrix computations) at first reading. 


116 J. Stillwell, Naive Lie Theory, DOI: 10.1007/978-0-387-78214-0_6, 
6.1 Normal subgroups and ideals


In Chapter 5 we found the tangent spaces of the classical Lie groups: the 
classical Lie algebras. In this chapter we use the tangent spaces to find 
candidates for simplicity among the classical Lie groups G. We do so by 
finding substructures of the tangent space g that are tangent spaces of the 
normal subgroups of G. These are the ideals,¹ defined as follows.


Definition. An ideal h of a Lie algebra g is a subspace of g closed under
Lie brackets with arbitrary members of g. That is, if Y ∈ h and X ∈ g then
[X,Y] ∈ h.


Then the relationship between normal subgroups and ideals is given by 
the following theorem. 


Tangent space of a normal subgroup. If H is a normal subgroup of a
matrix Lie group G, then $T_1(H)$ is an ideal of the Lie algebra $T_1(G)$.


Proof. $T_1(H)$ is a vector space, like any tangent space, and it is a subspace
of $T_1(G)$ because any tangent to H at 1 is a tangent to G at 1. Thus it
remains to show that $T_1(H)$ is closed under Lie brackets with members of
$T_1(G)$. To do this we use the property of a normal subgroup that B ∈ H and
A ∈ G implies $ABA^{-1}$ ∈ H.

It follows that $A(s)B(t)A(s)^{-1}$ is a smooth path in H for any smooth
paths A(s) in G and B(t) in H. As usual, we suppose A(0) = 1 = B(0), so
A’(0) = X ∈ $T_1(G)$ and B’(0) = Y ∈ $T_1(H)$. If we let


\[ C_s(t) = A(s)B(t)A(s)^{-1}, \]

then it follows as in Section 5.4 that


\[ D(s) = C_s'(0) = A(s)YA(s)^{-1} \]


is a smooth path in $T_1(H)$. It likewise follows that


\[ D'(0) = XY - YX \in T_1(H), \]


and hence $T_1(H)$ is an ideal, as claimed.


¹This terminology comes from algebraic number theory, via ring theory. In the 1840s,
Kummer introduced some objects he called “ideal numbers” and “ideal primes” in order to
restore unique prime factorization in certain systems of algebraic numbers where ordinary
prime factorization is not unique. Kummer’s “ideal numbers” did not have a clear meaning
at first, but in 1871 Dedekind gave them a concrete interpretation as certain sets of numbers
closed under sums, and closed under products with all numbers in the system. In the 1920s,
Emmy Noether carried the concept of ideal to general ring theory. Roughly speaking, a
ring is a set of objects with sum and product operations. The sum operation satisfies the
usual properties of sum (commutative, associative, etc.) but the product is required only
to “distribute” over sum: a(b + c) = ab + ac. A Lie algebra is a ring in this general sense
(with the Lie bracket as the “product” operation), so Lie algebra ideals are included in the
general concept of ideal.
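The differentiation step in the proof of the theorem on tangent spaces of normal subgroups can be illustrated numerically. The sketch below makes several assumptions that are not in the text: it takes G = U(2) and H = SU(2), uses the particular path A(s) given by the exponential of sX, and approximates derivatives and the exponential by a central difference and a truncated series. All helper names are illustrative.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(X, terms=40):
    """Truncated exponential series 1 + X + X^2/2! + ..."""
    result, term = identity(len(X)), identity(len(X))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, X))
        result = mat_add(result, term)
    return result


def bracket(A, B):
    return mat_add(mat_mul(A, B), mat_scale(-1, mat_mul(B, A)))


def in_su2(A, tol=1e-9):
    """Skew-Hermitian with trace zero, up to a tolerance."""
    skew = all(abs(A[i][j] + A[j][i].conjugate()) < tol for i in range(2) for j in range(2))
    return skew and abs(A[0][0] + A[1][1]) < tol


X = [[1j, 2 + 1j], [-2 + 1j, 3j]]   # in u(2): skew-Hermitian, trace 4i
Y = [[2j, 1 - 1j], [-1 - 1j, -2j]]  # in su(2): skew-Hermitian, trace 0


def D(s):
    """D(s) = A(s) Y A(s)^{-1} for the assumed path A(s) = exp(sX)."""
    return mat_mul(mat_mul(mat_exp(mat_scale(s, X)), Y), mat_exp(mat_scale(-s, X)))


h = 1e-5
derivative = mat_scale(1 / (2 * h), mat_add(D(h), mat_scale(-1, D(-h))))  # central difference
XY_minus_YX = bracket(X, Y)
derivative_error = max(abs(a - b) for ra, rb in zip(derivative, XY_minus_YX) for a, b in zip(ra, rb))

print(in_su2(D(0.5)), in_su2(XY_minus_YX), derivative_error)
```

The path D(s) stays in su(2), and its velocity at s = 0 matches XY − YX to within the accuracy of the finite difference.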


Remark. In Section 7.5 we will sharpen this theorem by showing that 
$T_1(H) \ne \{0\}$ provided H is not discrete, that is, provided there are points
in H not equal to 1 but arbitrarily close to it. Therefore, if g has no ideals 
other than itself and {0}, then the only nontrivial normal subgroups of G 
are discrete. We saw in Section 3.8 that any discrete normal subgroup of 
a path-connected group G is contained in Z(G), the center of G. For the 
generalized rotation groups G (which we found to be path-connected in 
Chapter 3, and which are the main candidates for simplicity), we already 
found Z(G) in Section 3.7. In each case Z(G) is finite, and hence discrete. 


This remark shows that the Lie algebra g = $T_1(G)$ can “see” normal
subgroups of G that are not too small. $T_1(G)$ retains an image of a normal
subgroup H as an ideal $T_1(H)$, which is “visible” ($T_1(H) \ne \{0\}$) provided
H is not discrete. Thus, if we leave aside the issue of discrete normal 
subgroups for the moment, the problem of finding simple matrix Lie groups 
essentially reduces to finding the Lie algebras with no nontrivial ideals. 

In analogy with the definition of simple group (Section 2.2), we define 
a simple Lie algebra to be one with no ideals other than itself and {0}. 
By the remarks above, we can make a big step toward finding simple Lie 
groups by finding the simple Lie algebras among those for the classical 
groups. We do this in the sections below, before returning to Lie groups to 
resolve the remaining difficulties with discrete subgroups and centers. 


Simplicity of so(3)


We know from Section 2.3 that SO(3) is a simple group, so we do not
really need to investigate whether so(3) is a simple Lie algebra. However,
it is easy to prove the simplicity of so(3) directly, and the proof is a model
for proofs we give for more complicated Lie algebras later in this chapter.

First, notice that the tangent space so(3) of SO(3) at 1 is the same as
the tangent space su(2) of SU(2) at 1. This is because elements of SO(3)
can be viewed as antipodal pairs ±q of quaternions q in SU(2). Tangents
to SU(2) are determined by the q near 1, in which case −q is not near 1,
so the tangents to SO(3) are the same as the tangents to SU(2).

Thus the Lie algebra so(3) equals su(2), which we know from Section
4.4 is the cross-product algebra on $\mathbb{R}^3$. (Another proof that so(3) is the
cross-product algebra on $\mathbb{R}^3$ is in Exercises 5.2.1-5.2.3.)


Simplicity of the cross-product algebra. The cross-product algebra is 
simple. 


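Proof. It suffices to show that any nonzero ideal equals $\mathbb{R}^3 = \mathbb{R}\mathbf{i} + \mathbb{R}\mathbf{j} + \mathbb{R}\mathbf{k}$,
where i, j, and k are the usual basis vectors for $\mathbb{R}^3$.

Suppose that $\mathfrak{I}$ is an ideal, with a nonzero member u = xi + yj + zk.
Suppose, for example, that x ≠ 0. By the definition of ideal, $\mathfrak{I}$ is closed
under cross products with all elements of $\mathbb{R}^3$. In particular,


\[ \mathbf{u} \times \mathbf{j} = x\mathbf{k} - z\mathbf{i} \in \mathfrak{I}, \]


and hence

\[ (x\mathbf{k} - z\mathbf{i}) \times \mathbf{i} = x\mathbf{j} \in \mathfrak{I}. \]


Then $x^{-1}(x\mathbf{j}) = \mathbf{j} \in \mathfrak{I}$ also, since $\mathfrak{I}$ is a subspace. It follows, by taking cross
products with k and i, that i, k ∈ $\mathfrak{I}$ as well.

Thus $\mathfrak{I}$ is a subspace of $\mathbb{R}^3$ that includes the basis vectors i, j, and k,
so $\mathfrak{I} = \mathbb{R}^3$. There is a similar argument if y ≠ 0 or z ≠ 0, and hence the
cross-product algebra on $\mathbb{R}^3$ is simple.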
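The steps of this proof can be traced on a concrete vector. The sketch below is an illustration with made-up names (`cross`, `scale`, `basis_from`): starting from a vector u with nonzero first component, it uses only cross products with basis vectors and scalar multiples, the operations an ideal is closed under, to recover i, j, and k. Exact rational arithmetic is an implementation choice to avoid rounding.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two vectors in R^3 given as 3-tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def scale(c, u):
    return tuple(c * a for a in u)


I_VEC, J_VEC, K_VEC = (1, 0, 0), (0, 1, 0), (0, 0, 1)


def basis_from(u):
    """Recover (i, j, k) from u = (x, y, z) with x != 0, following the proof."""
    x = Fraction(u[0])
    if x == 0:
        raise ValueError("this sketch handles only the case x != 0")
    step1 = cross(u, J_VEC)        # u x j = xk - zi
    step2 = cross(step1, I_VEC)    # (xk - zi) x i = xj
    got_j = scale(1 / x, step2)    # x^{-1}(xj) = j
    got_i = cross(got_j, K_VEC)    # j x k = i
    got_k = cross(I_VEC, got_j)    # i x j = k
    return got_i, got_j, got_k


u = (2, -3, 5)
print(cross(u, J_VEC), cross(cross(u, J_VEC), I_VEC))
print(basis_from(u))
```

The cases y ≠ 0 and z ≠ 0 would be handled the same way after relabeling the basis vectors.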


The algebraic argument above—nullifying all but one component of 
a nonzero element to show that a nonzero ideal $\mathfrak{I}$ includes all the basis
vectors—is the model for several simplicity proofs later in this chapter. The 
later proofs look more complicated, because they involve Lie bracketing 
of a nonzero matrix to nullify all but one basis element (which may be a 
matrix with more than one nonzero entry). But they similarly show that a 
nonzero ideal includes all basis elements, and hence is the whole algebra, 
so the general idea is the same. 


Exercises 


Another way in which $T_1(G)$ may misrepresent G is when $T_1(H) = T_1(G)$ but H
is not all of G.


6.1.1 Show that $T_1(O(n)) = T_1(SO(n))$ for each n, and that SO(n) is a normal
subgroup of O(n).


6.1.2 What are the cosets of SO(n) in O(n)?
An example of a matrix Lie group with a nontrivial normal subgroup is U(n).
We determined the appropriate tangent spaces in Section 5.3. 


6.1.3 Show that SU(n) is a normal subgroup of U(n) by describing it as the kernel 
of a homomorphism. 


6.1.4 Show that $T_1(SU(n))$ is an ideal of $T_1(U(n))$ by checking that it has the
required closure properties.


6.2 Ideals and homomorphisms 


If we restrict attention to matrix Lie groups (as we generally do in this 
book) then we cannot assume that every normal subgroup H of a Lie group 
G is the kernel of a matrix group homomorphism G → G/H. The problem
is that the quotient G/H of matrix groups is not necessarily a matrix group. 
This is why we derived the relationship between normal subgroups and 
ideals without reference to homomorphisms. 

Nevertheless, some important normal subgroups are kernels of matrix 
Lie group homomorphisms. One such homomorphism is the determinant 
map G → $\mathbb{C}^\times$, where $\mathbb{C}^\times$ denotes the group of nonzero complex numbers
(or 1 x 1 nonzero complex matrices) under multiplication. Also, any ideal 
is the kernel of a Lie algebra homomorphism—defined to be a map of 
Lie algebras that preserves sums, scalar multiples, and the Lie bracket— 
because in fact any Lie algebra is isomorphic to a matrix Lie algebra. 

An important Lie algebra homomorphism is the trace map, 


Tr(A) = sum of diagonal elements of A, 


for real or complex matrices A. We verify that Tr is a Lie algebra homo- 
morphism in the next section. 
The general theorem about kernels is the following. 


Kernel of a Lie algebra homomorphism. If φ : g → g′ is a Lie algebra
homomorphism, and


\[ \mathfrak{h} = \{X \in \mathfrak{g} : \varphi(X) = 0\} \]


is its kernel, then h is an ideal of g.
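Proof. Since φ preserves sums and scalar multiples, h is a subspace:


\[
\begin{aligned}
X_1, X_2 \in \mathfrak{h} &\Rightarrow \varphi(X_1) = 0,\ \varphi(X_2) = 0 \\
&\Rightarrow \varphi(X_1 + X_2) = 0 && \text{because $\varphi$ preserves sums} \\
&\Rightarrow X_1 + X_2 \in \mathfrak{h}, \\
X \in \mathfrak{h} &\Rightarrow \varphi(X) = 0 \\
&\Rightarrow c\varphi(X) = 0 \\
&\Rightarrow \varphi(cX) = 0 && \text{because $\varphi$ preserves scalar multiples} \\
&\Rightarrow cX \in \mathfrak{h}.
\end{aligned}
\]


Also, h is closed under Lie brackets with members of g because


\[
\begin{aligned}
X \in \mathfrak{h} &\Rightarrow \varphi(X) = 0 \\
&\Rightarrow \varphi([X,Y]) = [\varphi(X), \varphi(Y)] = [0, \varphi(Y)] = 0 \\
&\qquad \text{for any } Y \in \mathfrak{g} \text{ because $\varphi$ preserves Lie brackets} \\
&\Rightarrow [X,Y] \in \mathfrak{h} \quad \text{for any } Y \in \mathfrak{g}.
\end{aligned}
\]


Thus h is an ideal, as claimed.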


It follows from this theorem that a Lie algebra is not simple if it admits
a nontrivial homomorphism (one whose kernel is neither {0} nor the whole
algebra). This points to the existence of non-simple Lie
algebras, which we should look at first, if only to know what to avoid when 
we search for simple Lie algebras. 


Exercises 


There is a sense in which any homomorphism of a Lie group G “induces” a
homomorphism of the Lie algebra $T_1(G)$. We study this relationship in some depth in
Chapter 9. Here we explore the special case of the det homomorphism, assuming
also that G is a group for which exp maps $T_1(G)$ onto G.


6.2.1 If we map each X ∈ $T_1(G)$ to Tr(X), where does the corresponding member
$e^X$ of G go?


6.2.2 If we map each $e^X$ ∈ G to $\det(e^X)$, where does the corresponding X ∈ $T_1(G)$
go?


6.2.3 In particular, why is there a well-defined image of X when $e^X = e^{X'}$?
6.3 Classical non-simple Lie algebras


We know from Section 2.7 that SO(4) is not a simple group, so we expect 
that so(4) is not a simple Lie algebra. We also know, from Section 5.6, 
about the groups GL(n,C) and their subgroups SL(n,C). The subgroup 
SL(n,C) is normal in GL(n,C) because it is the kernel of the homomorphism

\[ \det : GL(n,\mathbb{C}) \to \mathbb{C}^\times. \]

It follows that GL(n,C) is not a simple group for any n, so we expect that
gl(n,C) is not a simple Lie algebra for any n. We now prove that these Lie 
algebras are not simple by finding suitable ideals. 


An ideal in gl(n,C)


We know from Section 5.6 that gl(n,C) = $M_n(\mathbb{C})$ (the space of all n × n
complex matrices), and sl(n,C) is the subspace of all matrices in $M_n(\mathbb{C})$
with trace zero. This subspace is an ideal, because it is the kernel of a Lie 
algebra homomorphism. 

Consider the trace map 


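\[ \mathrm{Tr} : M_n(\mathbb{C}) \to \mathbb{C}. \]


The kernel of this map is certainly sl(n,C), but we have to check that this
map is a Lie algebra homomorphism. It is a vector space homomorphism
because


\[ \mathrm{Tr}(X + Y) = \mathrm{Tr}(X) + \mathrm{Tr}(Y) \quad\text{and}\quad \mathrm{Tr}(zX) = z\,\mathrm{Tr}(X) \quad\text{for any } z \in \mathbb{C}, \]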


as is clear from the definition of trace. 

Also, if we view C as the Lie algebra with trivial Lie bracket [u,v] = 
uv — vu = 0, then Tr preserves the Lie bracket. This is due to the (slightly 
less obvious) property that Tr(XY) = Tr(YX), which can be checked by 
computing both sides (see Exercise 5.3.8). Assuming this property of Tr, 
we have 


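\[
\begin{aligned}
\mathrm{Tr}([X,Y]) &= \mathrm{Tr}(XY - YX) \\
&= \mathrm{Tr}(XY) - \mathrm{Tr}(YX) \\
&= 0 \\
&= [\mathrm{Tr}(X), \mathrm{Tr}(Y)].
\end{aligned}
\]

Thus Tr is a Lie bracket homomorphism and its kernel, sl(n,C), is
necessarily an ideal of $M_n(\mathbb{C})$ = gl(n,C).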
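The trace property behind this argument is easy to confirm on examples. The sketch below uses two arbitrarily chosen 3 × 3 complex matrices and illustrative helper names. It checks that the trace of XY equals the trace of YX, so the bracket of any two matrices has trace zero and therefore lies in the kernel of Tr.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def mat_scale(c, A):
    return [[c * a for a in row] for row in A]


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    return mat_add(mat_mul(A, B), mat_scale(-1, mat_mul(B, A)))


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


# Two arbitrary members of gl(3,C); neither has trace zero.
X = [[1 + 2j, 3, -1j], [0, 2 - 1j, 4], [5j, -2, 1]]
Y = [[2, 1j, 0], [1 - 1j, -3, 2j], [4, 0, -1 + 1j]]

trace_XY = trace(mat_mul(X, Y))
trace_YX = trace(mat_mul(Y, X))
trace_of_bracket = trace(bracket(X, Y))

print(trace_XY, trace_YX, trace_of_bracket)
```

Because the bracket of any two members of gl(n,C) already has trace zero, bracketing a trace-zero matrix with an arbitrary matrix certainly stays inside sl(n,C), which is the closure property an ideal needs.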
An ideal in so(4)


In Sections 2.5 and 2.7 we saw that every rotation of $\mathbb{H} = \mathbb{R}^4$ is a map of
the form $q \mapsto v^{-1}qw$, where v, w ∈ Sp(1) (the group of unit quaternions,
also known as SU(2)). In Section 2.7 we showed that the map


\[ \Phi : Sp(1) \times Sp(1) \to SO(4) \]


that sends (v,w) to the rotation $q \mapsto v^{-1}qw$ is a 2-to-1 homomorphism onto
SO(4). This is a Lie group homomorphism, so by Section 6.1 we expect it
to induce a Lie algebra homomorphism onto so(4),


\[ \varphi : \mathfrak{sp}(1) \times \mathfrak{sp}(1) \to \mathfrak{so}(4), \]


because sp(1) × sp(1) is surely the Lie algebra of Sp(1) × Sp(1). Indeed,
any smooth path in Sp(1) × Sp(1) has the form u(t) = (v(t), w(t)), so


\[ u'(0) = (v'(0), w'(0)) \in \mathfrak{sp}(1) \times \mathfrak{sp}(1). \]


And as (v(t), w(t)) runs through all pairs of smooth paths in Sp(1) × Sp(1),
(v’(0),w’(0)) runs through all pairs of velocity vectors in sp(1) × sp(1).

Moreover, the homomorphism φ is 1-to-1. Of the two pairs (v(t), w(t))
and (−v(t), −w(t)) that map to the same rotation $q \mapsto v(t)^{-1}qw(t)$, exactly
one goes through the identity 1 when t = 0 (the other goes through −1).
Therefore, the two pairs between them yield only one velocity vector in
sp(1) × sp(1), either (v’(0),w’(0)) or (−v’(0),−w’(0)). Thus φ is in fact
an isomorphism of sp(1) × sp(1) onto so(4). (For a matrix description of
this isomorphism, see Exercise 6.5.4.)

But sp(1) × sp(1) has a homomorphism with nontrivial kernel, namely,


\[ (v'(0), w'(0)) \mapsto (0, w'(0)), \quad\text{with kernel } \mathfrak{sp}(1) \times \{0\}. \]


The subspace sp(1) × {0} is therefore a nontrivial ideal of so(4). Since
sp(1) is isomorphic to so(3), and so(3) × {0} is isomorphic to so(3), this
ideal can be viewed as an so(3) inside so(4).


Exercises 


A more concrete proof that sl(n,C) is an ideal of gl(n,C) can be given by checking
that the matrices in sl(n,C) are closed under Lie bracketing with any member of
gl(n,C). In fact, the Lie bracket of any two elements of gl(n,C) lies in sl(n,C),
as the following exercises show.

We let

\[
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & & & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nn}
\end{pmatrix}
\]

be any element of gl(n,C), and consider its Lie bracket with $e_{ij}$, the matrix with
1 as its (i, j)-entry and zeros elsewhere.


6.3.1 Describe $Xe_{ij}$ and $e_{ij}X$. Hence show that the trace of $[X,e_{ij}]$ is $x_{ji} - x_{ji} = 0$.
6.3.2 Deduce from Exercise 6.3.1 that Tr([X,Y]) = 0 for any X, Y ∈ gl(n,C).
6.3.3 Deduce from Exercise 6.3.2 that sl(n,C) is an ideal of gl(n,C).


Another example of a non-simple Lie algebra is u(n), the algebra of n × n
skew-Hermitian matrices.


6.3.4 Find a 1-dimensional ideal $\mathfrak{I}$ in u(n), and show that $\mathfrak{I}$ is the tangent space
of Z(U(n)).


6.3.5 Also show that Z(U(n)) is the image, under the exponential map, of the
ideal $\mathfrak{I}$ in Exercise 6.3.4.


6.4 Simplicity of sl(n,C) and su(n)


We saw in Section 5.6 that sl(n,C) consists of all n × n complex matrices
with trace zero. This set of matrices is a vector space over C, and it has
a natural basis consisting of the matrices $e_{ij}$ for $i \ne j$ and $e_{ii} - e_{nn}$ for
$i = 1, 2, \ldots, n-1$, where $e_{ij}$ is the matrix with 1 as its (i, j)-entry and zeros
elsewhere. These matrices span sl(n,C). In fact, for any X ∈ sl(n,C),

\[
X = (x_{ij}) = \sum_{i \ne j} x_{ij}e_{ij} + \sum_{i=1}^{n-1} x_{ii}(e_{ii} - e_{nn})
\]

because $x_{nn} = -x_{11} - x_{22} - \cdots - x_{n-1,n-1}$ for the trace of X to be zero. Also,
X is the zero matrix only if all the coefficients are zero, so the matrices $e_{ij}$
for $i \ne j$ and $e_{ii} - e_{nn}$ for $i = 1, 2, \ldots, n-1$ are linearly independent.

These basis elements are convenient for Lie algebra calculations because
the Lie bracket of any X with an $e_{ij}$ has few nonzero entries. This
enables us to take any nonzero member of an ideal $\mathfrak{I}$ and manipulate it
to find a nonzero multiple of each basis element in $\mathfrak{I}$, thus showing that
sl(n,C) contains no nontrivial ideals.

Simplicity of sl(n,C). For each n, sl(n,C) is a simple Lie algebra.


Proof. If $X = (x_{ij})$ is any n × n matrix, then $Xe_{ij}$ has all columns zero
except the jth, which is occupied by the ith column of X, and $-e_{ij}X$ has
all rows zero except the ith, which is occupied by −(row j) of X.
Therefore, since $[X,e_{ij}] = Xe_{ij} - e_{ij}X$, we have

\[
\text{column } j \text{ of } [X,e_{ij}] =
\begin{pmatrix}
x_{1i} \\ \vdots \\ x_{i-1,i} \\ x_{ii} - x_{jj} \\ x_{i+1,i} \\ \vdots \\ x_{ni}
\end{pmatrix},
\]

and

\[
\text{row } i \text{ of } [X,e_{ij}] =
\begin{pmatrix}
-x_{j1} & \cdots & -x_{j,j-1} & x_{ii} - x_{jj} & -x_{j,j+1} & \cdots & -x_{jn}
\end{pmatrix},
\]

and all other entries of $[X,e_{ij}]$ are zero. In the (i, j)-position, where the
shifted row and column cross, we get the element $x_{ii} - x_{jj}$.

We now use such bracketing to show that an ideal $\mathfrak{I}$ with a nonzero
member X includes all the basis elements of sl(n,C), so $\mathfrak{I}$ = sl(n,C).


Case (i): X has nonzero entry $x_{ji}$ for some $i \ne j$.


Multiply $[X,e_{ij}]$ by $e_{ij}$ on the right. This destroys all columns except
the ith, whose only nonzero element is $-x_{ji}$ in the (i, i)-position, moving it
to the (i, j)-position (because column i is moved to column j position).

Now multiply $[X,e_{ij}]$ by $-e_{ij}$ on the left. This destroys all rows except
the jth, whose only nonzero element is $x_{ji}$ at the (j, j)-position, moving it
to the (i, j)-position and changing its sign (because row j is moved to row
i position, with a sign change).

It follows that $[X,e_{ij}]e_{ij} - e_{ij}[X,e_{ij}] = [[X,e_{ij}],e_{ij}]$ contains the nonzero
element $-2x_{ji}$ at the (i, j)-position, and zeros elsewhere.

Thus the ideal $\mathfrak{I}$ containing X also contains $e_{ij}$. By further bracketing
we can show that all the basis elements of sl(n,C) are in $\mathfrak{I}$. For
a start, if $e_{ij} \in \mathfrak{I}$ then $e_{ji} \in \mathfrak{I}$, because the calculation above shows that
$[[e_{ij},e_{ji}],e_{ji}] = -2e_{ji}$. The other basis elements can be obtained by using
the result

\[
[e_{ij}, e_{jk}] =
\begin{cases}
e_{ik} & \text{if } i \ne k, \\
e_{ii} - e_{jj} & \text{if } i = k,
\end{cases}
\]

which can be checked by matrix multiplication (Exercise 6.4.1).
For example, suppose we have $e_{12}$ and we want to get $e_{43}$. This is
achieved by the following pair of bracketings, from right and left:

\[ [e_{12}, e_{23}] = e_{13}, \]

\[ [e_{41}, e_{13}] = e_{43}. \]

All $e_{kl}$ with $k \ne l$ are obtained similarly. Once we have all of these, we
obtain the remaining basis elements of sl(n,C) by

\[ [e_{in}, e_{ni}] = e_{ii} - e_{nn}. \]


Case (ii). All the nonzero entries of X are among $x_{11}, x_{22}, \ldots, x_{nn}$.


Not all these elements are equal (otherwise, Tr(X) ≠ 0), so we can
choose i and j such that $x_{ii} - x_{jj} \ne 0$. Now, for this X, the calculation of
$[X,e_{ij}]$ gives

\[ [X, e_{ij}] = (x_{ii} - x_{jj})e_{ij}. \]

Thus $\mathfrak{I}$ includes a nonzero multiple of $e_{ij}$, and hence $e_{ij}$ itself. We can
now repeat the rest of the argument in case (i) to conclude again that $\mathfrak{I}$ =
sl(n,C), so sl(n,C) is simple.
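
The key computations in this proof are easy to confirm by machine. The sketch below is an illustration only: it uses 0-based indices, and the helper names (`unit`, `bracket`, `scale`) and the sample trace-zero matrix are assumptions made for the example.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def unit(n, i, j):
    """The matrix e_ij (0-based): 1 in position (i, j), zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * a for a in row] for row in A]

# A sample trace-zero matrix (hypothetical data): 2 + 3 - 5 = 0.
X = [[2, -1, 4], [7, 3, 5], [6, 8, -5]]
i, j = 0, 2
e = unit(3, i, j)

# Case (i): the double bracket isolates the entry x_ji as a multiple of e_ij.
double = bracket(bracket(X, e), e)
expected = scale(-2 * X[j][i], e)
print(double == expected)

# The example in the text, e_12 -> e_13 -> e_43, written with 0-based indices.
e13 = bracket(unit(4, 0, 1), unit(4, 1, 2))
e43 = bracket(unit(4, 3, 0), e13)
print(e13 == unit(4, 0, 2), e43 == unit(4, 3, 2))
```

The double bracket equals $-2x_{ji}e_{ij}$ for every X, because $e_{ij}e_{ij} = 0$ when $i \ne j$, so $[[X,e_{ij}],e_{ij}] = -2e_{ij}Xe_{ij}$.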


An easy corollary of this result is the following:

Simplicity of su(n). For each n, su(n) is a simple Lie algebra.

Proof. We use the result from Section 5.6, that

\[ \mathfrak{sl}(n,\mathbb{C}) = \mathfrak{su}(n) + i\,\mathfrak{su}(n) = \{A + iB : A, B \in \mathfrak{su}(n)\}. \]

It follows that if $\mathfrak{I}$ is a nontrivial ideal of su(n) then

\[ \mathfrak{I} + i\mathfrak{I} = \{C + iD : C, D \in \mathfrak{I}\} \]

is a nontrivial ideal of sl(n,C). One only has to check that


1. $\mathfrak{I} + i\mathfrak{I}$ is not all of sl(n,C), which is true because of the 1-to-1
correspondence $X = X_1 + iX_2$ between elements X of sl(n,C) and ordered
pairs $(X_1, X_2)$ such that $X_1, X_2 \in$ su(n).
If $\mathfrak{I} + i\mathfrak{I}$ includes each X ∈ sl(n,C) then $\mathfrak{I}$ includes each $X_1 \in$ su(n),
contrary to the assumption that $\mathfrak{I}$ is not all of su(n).

2. $\mathfrak{I} + i\mathfrak{I}$ is a vector subspace (over C) of sl(n,C). Closure under sums
is obvious. And the scalar multiple $(a + ib)(C + iD)$ of any $C + iD$
in $\mathfrak{I} + i\mathfrak{I}$ is also in $\mathfrak{I} + i\mathfrak{I}$ for any $a + ib \in \mathbb{C}$ because

\[ (a + ib)(C + iD) = (aC - bD) + i(bC + aD) \]

and $aC - bD,\ bC + aD \in \mathfrak{I}$ by the vector space properties of $\mathfrak{I}$.


3. $\mathfrak{I} + i\mathfrak{I}$ is closed under the Lie bracket with any $A + iB \in$ sl(n,C).
This is because, if $C + iD \in \mathfrak{I} + i\mathfrak{I}$, then

\[ [C + iD, A + iB] = [C,A] - [D,B] + i([D,A] + [C,B]) \in \mathfrak{I} + i\mathfrak{I} \]

by the closure properties of $\mathfrak{I}$.


Thus a nontrivial ideal $\mathfrak{I}$ of su(n) gives a nontrivial ideal of sl(n,C).
Therefore $\mathfrak{I}$ does not exist.


Exercises

6.4.1 Verify that

\[
[e_{ij}, e_{jk}] =
\begin{cases}
e_{ik} & \text{if } i \ne k, \\
e_{ii} - e_{jj} & \text{if } i = k.
\end{cases}
\]

6.4.2 More generally, verify that $[e_{ij}, e_{kl}] = \delta_{jk}e_{il} - \delta_{li}e_{kj}$.


In Section 6.6 we will be using multiples of the basis vectors $e_{mm}$ by the quaternion
units i, j, and k. Here is a taste of the kind of result we require.


6.4.3 Show that $[\mathbf{i}(e_{pp} - e_{qq}), \mathbf{j}(e_{pp} - e_{qq})] = 2\mathbf{k}(e_{pp} + e_{qq})$.


6.4.4 Show that an ideal of quaternion matrices that includes $\mathbf{i}e_{mm}$ also includes
$\mathbf{j}e_{mm}$ and $\mathbf{k}e_{mm}$.


6.5 Simplicity of so(n) for n > 4


The Lie algebra so(n) of real n × n skew-symmetric matrices has a basis
consisting of the n(n − 1)/2 matrices

\[ E_{ij} = e_{ij} - e_{ji} \quad \text{for } i < j. \]

Indeed, since $E_{ij}$ has 1 in the (i, j)-position and −1 in the (j, i)-position,
any skew-symmetric matrix is uniquely expressible in the form

\[ X = \sum_{i<j} x_{ij}E_{ij}. \]

Our strategy for proving that so(n) is simple is like that used in Section 6.4
to prove that sl(n,C) is simple. It involves two stages:


• First we suppose that X is a nonzero member of some ideal $\mathfrak{I}$ and
take Lie brackets of X with suitable basis vectors until we obtain a
nonzero multiple of some basis vector in $\mathfrak{I}$.


• Then, by further Lie bracketing, we show that all basis vectors are
in fact in $\mathfrak{I}$, so $\mathfrak{I}$ = so(n).


The first stage, as with sl(n,C), selectively nullifies rows and columns until
only a nonzero multiple of a basis vector remains. It is a little trickier to
do this for so(n), because multiplying by $E_{ij}$ leaves intact two columns
(or rows, if one multiplies on the left), rather than one. To nullify all but
two, symmetrically positioned, entries we need n > 4, which is no surprise
because so(4) is not simple.

In the first stage we need to keep track of matrix entries as columns
and rows change position, so we introduce a notation that provides number
labels to the left of rows and above columns. For example, we write

\[
E_{ij} =
\begin{array}{c|ccc}
 & i & & j \\ \hline
i & & & 1 \\
j & -1 & &
\end{array}
\]

to indicate that $E_{ij}$ has 1 in the (i, j)-position, −1 in the (j, i)-position, and
zeros elsewhere.

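Now suppose X is the n × n matrix with (i, j)-entry $x_{ij}$. Multiplying X
on the right by $E_{ij}$ and on the left by $-E_{ij}$, we find that

\[
XE_{ij} =
\begin{array}{c|ccccc}
 & & i & & j & \\ \hline
 & & -x_{1j} & & x_{1i} & \\
 & & -x_{2j} & & x_{2i} & \\
 & & \vdots & & \vdots & \\
 & & -x_{nj} & & x_{ni} &
\end{array}
\]

and

\[
-E_{ij}X =
\begin{array}{c|cccc}
 & & & & \\
i & -x_{j1} & -x_{j2} & \cdots & -x_{jn} \\
 & & & & \\
j & x_{i1} & x_{i2} & \cdots & x_{in} \\
 & & & &
\end{array}
\]

Thus, right multiplication by $E_{ij}$ preserves only column i, which goes to
position j, and column j, which goes to position i with its sign changed.
Left multiplication by $-E_{ij}$ preserves row i, which goes to position j, and
row j, which goes to position i with its sign changed.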

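The Lie bracket of X with $E_{ij}$ is the sum of $XE_{ij}$ and $-E_{ij}X$, namely

\[
[X,E_{ij}] =
\begin{array}{c|cccccccc}
 & & & & i & & j & & \\ \hline
 & & & & -x_{1j} & & x_{1i} & & \\
 & & & & -x_{2j} & & x_{2i} & & \\
 & & & & \vdots & & \vdots & & \\
i & -x_{j1} & -x_{j2} & \cdots & -x_{ji} - x_{ij} & \cdots & x_{ii} - x_{jj} & \cdots & -x_{jn} \\
 & & & & \vdots & & \vdots & & \\
j & x_{i1} & x_{i2} & \cdots & x_{ii} - x_{jj} & \cdots & x_{ij} + x_{ji} & \cdots & x_{in} \\
 & & & & \vdots & & \vdots & & \\
 & & & & -x_{nj} & & x_{ni} & &
\end{array}
\]

Note that the (i, j)- and (j, i)-entries are zero when X ∈ so(n) because $x_{ii} =
x_{jj} = 0$ in a skew-symmetric matrix. Likewise, the (i, i)- and (j, j)-entries
are zero for a skew-symmetric X, so for X ∈ so(n) we have the simpler
formula (*) below. In short, the rule for bracketing a skew-symmetric X
with $E_{ij}$ is:


• Exchange rows i and j, giving the new row i a minus sign.
• Exchange columns i and j, giving the new column i a minus sign.


• Put 0 where the new rows and columns meet and 0 everywhere else.

\[
[X,E_{ij}] =
\begin{array}{c|cccccccc}
 & & & & i & & j & & \\ \hline
 & & & & -x_{1j} & & x_{1i} & & \\
 & & & & -x_{2j} & & x_{2i} & & \\
 & & & & \vdots & & \vdots & & \\
i & -x_{j1} & -x_{j2} & \cdots & 0 & \cdots & 0 & \cdots & -x_{jn} \\
 & & & & \vdots & & \vdots & & \\
j & x_{i1} & x_{i2} & \cdots & 0 & \cdots & 0 & \cdots & x_{in} \\
 & & & & \vdots & & \vdots & & \\
 & & & & -x_{nj} & & x_{ni} & &
\end{array}
\tag{*}
\]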


We now make a series of applications of formula (*) for $[X,E_{ij}]$ to
reduce a given nonzero X ∈ so(n) to a nonzero multiple of a basis vector.
The result is the following theorem.


Simplicity of so(n). For each n > 4, so(n) is a simple Lie algebra.


Proof. Suppose that $\mathfrak{I}$ is a nonzero ideal of so(n), and that X is a nonzero
n × n matrix in $\mathfrak{I}$. We will show that $\mathfrak{I}$ contains all the basis vectors $E_{ij}$, so
$\mathfrak{I}$ = so(n).


In the first stage of the proof, we Lie bracket X with a series of four
basis elements to produce a matrix (necessarily skew-symmetric) with just
two nonzero entries. The first bracketing produces the matrix $X_1 = [X,E_{ij}]$
shown in (*) above, which has zeros everywhere except in columns i and j
and rows i and j.


For the second bracketing we choose a $k \ne i, j$ and form $X_2 = [X_1,E_{jk}]$,
which has row and column j of $X_1$ moved to the k position, row and column
k of $X_1$ moved to the j position with their signs changed, and zeros where
these rows and columns meet. Row and column k in $X_1 = [X,E_{ij}]$ have
at most two nonzero entries (where they meet row and column i and j),
so row and column j in $X_2 = [X_1,E_{jk}]$ each have at most one, since the
(j, j)-entry $-x_{ik} - x_{ki}$ is necessarily zero. The result is that

\[
X_2 =
\begin{array}{c|cccccccc}
 & & i & & j & & k & & \\ \hline
 & & & & & & x_{1i} & & \\
 & & & & & & \vdots & & \\
i & & 0 & & x_{jk} & & 0 & & \\
j & & x_{kj} & & 0 & & 0 & & \\
k & x_{i1} & 0 & \cdots & 0 & \cdots & 0 & \cdots & x_{in} \\
 & & & & & & \vdots & & \\
 & & & & & & x_{ni} & &
\end{array}
\]


Now choose $l \ne i, j, k$ and bracket $X_2 = [X_1,E_{jk}]$ with $E_{il}$. The only
nonzero elements in row and column l of $X_2$ are $x_{li}$ at position (l, k) in row
l and $x_{il}$ at position (k, l) in column l. Therefore, $X_3 = [X_2,E_{il}]$ is given by

\[
[X_2,E_{il}] =
\begin{array}{c|cccc}
 & i & j & k & l \\ \hline
i & & & x_{il} & \\
j & & & & x_{kj} \\
k & -x_{il} & & & \\
l & & -x_{kj} & &
\end{array}
\]

To complete this stage we choose $m \ne i, j, k, l$ and bracket $X_3 = [X_2,E_{il}]$
with $E_{lm}$. Since row and column m are zero, the result $X_4 = [X_3,E_{lm}]$ is the
matrix with $x_{kj}$ in the (j, m)-position and $x_{jk}$ in the (m, j)-position; that is,

\[ [X_3, E_{lm}] = x_{kj}E_{jm}. \]


Now we work backward. If X is a nonzero member of the ideal $\mathfrak{I}$, let $x_{kj}$
be a nonzero entry of X. Provided n > 4, we can choose $i \ne j, k$, then
$l \ne i, j, k$ and $m \ne i, j, k, l$, and construct the nonzero element $x_{kj}E_{jm}$ of $\mathfrak{I}$
by a sequence of Lie brackets as above. Finally, we multiply by $1/x_{kj}$ and
obtain $E_{jm} \in \mathfrak{I}$.

The second stage obtains all other basis elements of so(n) by forming
Lie brackets of $E_{jm}$ with other basis elements. This proceeds exactly as
for sl(n,C), because the $E_{ij}$ satisfy relations like those satisfied by the $e_{ij}$,
namely

\[
\begin{aligned}
{[E_{ij}, E_{jk}]} &= E_{ik} \quad \text{if } i \ne k, \\
{[E_{ij}, E_{ki}]} &= E_{jk} \quad \text{if } j \ne k.
\end{aligned}
\]

Thus, when n > 4, any nonzero ideal of so(n) is equal to so(n), as
required.
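
The chain of four brackets in the first stage can be replayed on a sample matrix. This is a sketch under stated assumptions: indices are 0-based, the five index values are simply 0 through 4, and the helper names and the skew-symmetric test matrix are invented for the illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def bracket(A, B):
    """Lie bracket [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[r][c] - BA[r][c] for c in range(n)] for r in range(n)]

def E(n, a, b):
    """Basis matrix E_ab = e_ab - e_ba (0-based indices)."""
    M = [[0] * n for _ in range(n)]
    M[a][b], M[b][a] = 1, -1
    return M

def scale(c, A):
    """Scalar multiple c * A."""
    return [[c * a for a in row] for row in A]

n = 5
# A sample skew-symmetric matrix with all off-diagonal entries nonzero.
X = [[0] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1, n):
        X[a][b] = 5 * a + b + 1
        X[b][a] = -X[a][b]

i, j, k, l, m = 0, 1, 2, 3, 4        # five different index values
X1 = bracket(X, E(n, i, j))
X2 = bracket(X1, E(n, j, k))
X3 = bracket(X2, E(n, i, l))
X4 = bracket(X3, E(n, l, m))

# Only the multiple x_kj of E_jm should survive.
print(X4 == scale(X[k][j], E(n, j, m)))
```

The intermediate matrix `X3` can be compared in the same way with $x_{il}E_{ik} + x_{kj}E_{jl}$, the linear combination suggested in the hint to Exercise 6.5.3.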


The first stage of the proof above may seem a little complicated, but
I doubt that it can be substantially simplified. If it were much simpler it
would be wrong! We need to use five different values i, j, k, l, m because
so(4) is not simple, so the result is false for a 4 × 4 matrix X.


Exercises

6.5.1 Prove that

\[
\begin{aligned}
{[E_{ij}, E_{jk}]} &= E_{ik} \quad \text{if } i \ne k, \\
{[E_{ij}, E_{ki}]} &= E_{jk} \quad \text{if } j \ne k.
\end{aligned}
\]

6.5.2 Also show that $[E_{ij}, E_{kl}] = 0$ if i, j, k, l are all different.


6.5.3 Use Exercises 6.5.1 and 6.5.2 to give another proof that $[X_3, E_{lm}] = x_{kj}E_{jm}$.
(Hint: Write $X_3$ as a linear combination of $E_{ik}$ and $E_{jl}$.)


6.5.4 Prove that each 4 × 4 skew-symmetric matrix is uniquely decomposable as
a sum

\[
\begin{pmatrix}
0 & -a & -b & -c \\
a & 0 & -c & b \\
b & c & 0 & -a \\
c & -b & a & 0
\end{pmatrix}
+
\begin{pmatrix}
0 & -x & -y & -z \\
x & 0 & z & -y \\
y & -z & 0 & x \\
z & y & -x & 0
\end{pmatrix}.
\]

6.5.5 Setting $I = -E_{12} - E_{34}$, $J = -E_{13} + E_{24}$, and $K = -E_{14} - E_{23}$, show that
$[I,J] = 2K$, $[J,K] = 2I$, and $[K,I] = 2J$.


6.5.6 Deduce from Exercises 6.5.4 and 6.5.5 that so(4) is isomorphic to the direct
product so(3) × so(3) (also known as the direct sum and commonly written
so(3) ⊕ so(3)).

6.6 Simplicity of sp(n)


If X ∈ sp(n) we have $X + \overline{X}^{\mathsf{T}} = 0$, where $\overline{X}$ is the result of replacing each
entry in the matrix X by its quaternion conjugate. Thus, if $X = (x_{ij})$ and

\[ x_{ij} = a_{ij} + b_{ij}\mathbf{i} + c_{ij}\mathbf{j} + d_{ij}\mathbf{k}, \]

then

\[ \overline{x}_{ij} = a_{ij} - b_{ij}\mathbf{i} - c_{ij}\mathbf{j} - d_{ij}\mathbf{k} \]

and hence

\[ x_{ji} = -a_{ij} + b_{ij}\mathbf{i} + c_{ij}\mathbf{j} + d_{ij}\mathbf{k}, \]

where $a_{ij}, b_{ij}, c_{ij}, d_{ij} \in \mathbb{R}$. (And, of course, the quaternion units i, j, and
k are completely unrelated to the integers i, j used to number rows and
columns.) In particular, each diagonal entry $x_{ii}$ of X is pure imaginary.

This gives the following obvious basis vectors for sp(n) as a vector
space over R. The matrices $e_{ii}$ and $E_{ij}$ are as in Sections 6.4 and 6.5.


• For $i = 1, 2, \ldots, n$, the matrices $\mathbf{i}e_{ii}$, $\mathbf{j}e_{ii}$, and $\mathbf{k}e_{ii}$.
• For each pair (i, j) with i < j, the matrices $E_{ij}$.


• For each pair (i, j) with i < j, the matrices $\mathbf{i}\tilde{E}_{ij}$, $\mathbf{j}\tilde{E}_{ij}$, and $\mathbf{k}\tilde{E}_{ij}$, where
$\tilde{E}_{ij}$ is the matrix with 1 in the (i, j)-position, 1 in the (j, i)-position,
and zeros elsewhere.


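To prove that sp(n) is simple we suppose that $\mathfrak{I}$ is an ideal of sp(n) with
a nonzero element $X = (x_{ij})$. Then, as before, we reduce X to an arbitrary
basis element by a series of Lie bracketings and vector space operations.
Once we have found all the basis elements in $\mathfrak{I}$, we know that $\mathfrak{I}$ = sp(n).
We have a more motley collection of basis elements than ever before, but
the job of finding them is made easier by the presence of the very simple
basis elements $\mathbf{i}e_{ii}$, $\mathbf{j}e_{ii}$, and $\mathbf{k}e_{ii}$.

In particular, $\mathfrak{I}$ includes

\[
[X, \mathbf{i}e_{ii}] = X\mathbf{i}e_{ii} - \mathbf{i}e_{ii}X =
\begin{array}{c|ccccc}
 & & & i & & \\ \hline
 & & & x_{1i}\mathbf{i} & & \\
 & & & \vdots & & \\
i & -\mathbf{i}x_{i1} & \cdots & x_{ii}\mathbf{i} - \mathbf{i}x_{ii} & \cdots & -\mathbf{i}x_{in} \\
 & & & \vdots & & \\
 & & & x_{ni}\mathbf{i} & &
\end{array}
\tag{*}
\]

and hence also, if $i \ne j$,

\[
[[X, \mathbf{i}e_{ii}], \mathbf{i}e_{jj}] =
\begin{array}{c|cc}
 & i & j \\ \hline
i & & -\mathbf{i}x_{ij}\mathbf{i} \\
j & -\mathbf{i}x_{ji}\mathbf{i} &
\end{array}
\]

where all entries are zero except those explicitly shown.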
This gets the essential matrix calculations out of the way, and we are 
ready to prove our theorem. 


Simplicity of sp(n). For all n, sp(n) is a simple Lie algebra. 


Proof. When n = 1, we have sp(1) = su(2), which we proved to be simple
in Section 6.4. Thus we can assume n ≥ 2, which allows us to use the
computations above.

Suppose that $\mathfrak{I}$ is an ideal of sp(n), with a nonzero element $X = (x_{ij})$.


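Case (i). All nonzero entries $x_{ij}$ of X are on the diagonal.


In this case (*) gives the element of $\mathfrak{I}$

\[ [X, \mathbf{i}e_{ii}] = (x_{ii}\mathbf{i} - \mathbf{i}x_{ii})e_{ii}, \]

and we can similarly obtain the further elements

\[
\begin{aligned}
{[X, \mathbf{j}e_{ii}]} &= (x_{ii}\mathbf{j} - \mathbf{j}x_{ii})e_{ii}, \\
{[X, \mathbf{k}e_{ii}]} &= (x_{ii}\mathbf{k} - \mathbf{k}x_{ii})e_{ii}.
\end{aligned}
\]

Now if $x_{ii} = b_{ii}\mathbf{i} + c_{ii}\mathbf{j} + d_{ii}\mathbf{k}$ we find

\[
\begin{aligned}
x_{ii}\mathbf{i} - \mathbf{i}x_{ii} &= -2c_{ii}\mathbf{k} + 2d_{ii}\mathbf{j}, \\
x_{ii}\mathbf{j} - \mathbf{j}x_{ii} &= 2b_{ii}\mathbf{k} - 2d_{ii}\mathbf{i}, \\
x_{ii}\mathbf{k} - \mathbf{k}x_{ii} &= -2b_{ii}\mathbf{j} + 2c_{ii}\mathbf{i}.
\end{aligned}
\]

So, by the closure of $\mathfrak{I}$ under Lie brackets and real multiples, we have

\[ (-c_{ii}\mathbf{k} + d_{ii}\mathbf{j})e_{ii}, \quad (b_{ii}\mathbf{k} - d_{ii}\mathbf{i})e_{ii}, \quad (-b_{ii}\mathbf{j} + c_{ii}\mathbf{i})e_{ii} \quad \text{in } \mathfrak{I}. \]

Lie bracketing these three elements with $\mathbf{k}1$, $\mathbf{i}1$, $\mathbf{j}1$ respectively gives us
(after dividing by 2)

\[ d_{ii}\mathbf{i}e_{ii}, \quad b_{ii}\mathbf{j}e_{ii}, \quad c_{ii}\mathbf{k}e_{ii} \quad \text{in } \mathfrak{I}. \]

Thus if $x_{ii}$ is a nonzero entry in X we have at least one of the basis vectors
$\mathbf{i}e_{ii}$, $\mathbf{j}e_{ii}$, $\mathbf{k}e_{ii}$ in $\mathfrak{I}$. Lie bracketing the basis vector in $\mathfrak{I}$ with the other two
then gives us all three of $\mathbf{i}e_{ii}$, $\mathbf{j}e_{ii}$, $\mathbf{k}e_{ii}$ in $\mathfrak{I}$. (Here, the facts that $\mathbf{jk} = -\mathbf{kj} = \mathbf{i}$
and so on work in our favor.)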

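Until now, we have found $\mathbf{i}e_{ii}$, $\mathbf{j}e_{ii}$, $\mathbf{k}e_{ii}$ in $\mathfrak{I}$ only for one value of i. To
complete our collection of diagonal basis vectors we first note that

\[
[E_{ij}, \mathbf{i}e_{ii}] = -\mathbf{i}\tilde{E}_{ij}, \quad
[E_{ij}, \mathbf{j}e_{ii}] = -\mathbf{j}\tilde{E}_{ij}, \quad
[E_{ij}, \mathbf{k}e_{ii}] = -\mathbf{k}\tilde{E}_{ij},
\tag{**}
\]

as special cases of the formula (*). Thus, since $\mathfrak{I}$ is closed under real
multiples, we have

\[ \mathbf{i}\tilde{E}_{ij}, \quad \mathbf{j}\tilde{E}_{ij}, \quad \mathbf{k}\tilde{E}_{ij} \quad \text{in } \mathfrak{I} \]

for some i and arbitrary $j \ne i$. Then we notice that

\[ [\mathbf{i}\tilde{E}_{ij}, \mathbf{j}\tilde{E}_{ij}] = 2\mathbf{k}(e_{ii} + e_{jj}). \]

So $\mathbf{k}(e_{ii} + e_{jj})$ and $\mathbf{k}e_{ii}$ are both in $\mathfrak{I}$, and hence their difference $\mathbf{k}e_{jj}$ is in
$\mathfrak{I}$, for any j. We then find $\mathbf{i}e_{jj}$ in $\mathfrak{I}$ by Lie bracketing $\mathbf{j}e_{jj}$ with $\mathbf{k}e_{jj}$, and
$\mathbf{j}e_{jj}$ in $\mathfrak{I}$ by Lie bracketing $\mathbf{i}e_{jj}$ with $\mathbf{k}e_{jj}$.

Now that we have the diagonal basis vectors $\mathbf{i}e_{ii}$, $\mathbf{j}e_{ii}$, $\mathbf{k}e_{ii}$ in $\mathfrak{I}$ for all
i, we can reapply the formulas (**) to get the basis vectors $\mathbf{i}\tilde{E}_{ij}$, $\mathbf{j}\tilde{E}_{ij}$, and
$\mathbf{k}\tilde{E}_{ij}$ for all i and j with i < j. Finally, we get all the $E_{ij}$ in $\mathfrak{I}$ by the formula

\[ [\mathbf{i}\tilde{E}_{ij}, \mathbf{i}e_{ii}] = E_{ij}, \]

which also follows from (*). Thus all the basis vectors of sp(n) are in $\mathfrak{I}$,
and hence $\mathfrak{I}$ = sp(n).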


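Case (ii). X has a nonzero entry of the form $x_{ij} = a_{ij} + b_{ij}\mathbf{i} + c_{ij}\mathbf{j} + d_{ij}\mathbf{k}$,
for some i < j.


Our preliminary calculations show that the element $[[X, \mathbf{i}e_{ii}], \mathbf{i}e_{jj}]$ of $\mathfrak{I}$
has zeros everywhere except for $-\mathbf{i}x_{ij}\mathbf{i}$ in the (i, j)-position, and its negative
conjugate $-\mathbf{i}x_{ji}\mathbf{i}$ in the (j, i)-position. Explicitly, the (i, j)-entry is

\[ -\mathbf{i}x_{ij}\mathbf{i} = a_{ij} + b_{ij}\mathbf{i} - c_{ij}\mathbf{j} - d_{ij}\mathbf{k}, \]

so we have

\[ [[X, \mathbf{i}e_{ii}], \mathbf{i}e_{jj}] = a_{ij}E_{ij} + (b_{ij}\mathbf{i} - c_{ij}\mathbf{j} - d_{ij}\mathbf{k})\tilde{E}_{ij} \in \mathfrak{I}. \]

If $a_{ij}$ is the only nonzero coefficient in $[[X, \mathbf{i}e_{ii}], \mathbf{i}e_{jj}]$ we have $E_{ij} \in \mathfrak{I}$.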
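Then, writing $E_{ij} = e_{ij} - e_{ji}$, $\tilde{E}_{ij} = e_{ij} + e_{ji}$, we find from the formula
$[e_{ij}, e_{ji}] = e_{ii} - e_{jj}$ of Section 6.4 the following elements of $\mathfrak{I}$:

\[
\begin{aligned}
{[E_{ij}, \mathbf{i}\tilde{E}_{ij}]} &= 2\mathbf{i}(e_{ii} - e_{jj}), \\
{[E_{ij}, \mathbf{j}\tilde{E}_{ij}]} &= 2\mathbf{j}(e_{ii} - e_{jj}), \\
{[E_{ij}, \mathbf{k}\tilde{E}_{ij}]} &= 2\mathbf{k}(e_{ii} - e_{jj}).
\end{aligned}
\]

The first two of these elements give us

\[ [\mathbf{i}(e_{ii} - e_{jj}), \mathbf{j}(e_{ii} - e_{jj})] = 2\mathbf{k}(e_{ii} + e_{jj}) \in \mathfrak{I}. \]

(Another big “thank you” to noncommutative quaternion multiplication!)
Adding the last two elements found, we find $\mathbf{k}e_{ii} \in \mathfrak{I}$, so $\mathfrak{I}$ = sp(n) for the
same reasons as in Case (i).

Finally, if one of the coefficients $b_{ij}$, $c_{ij}$, or $d_{ij}$ is nonzero, we simplify
$a_{ij}E_{ij} + (b_{ij}\mathbf{i} - c_{ij}\mathbf{j} - d_{ij}\mathbf{k})\tilde{E}_{ij}$ by Lie bracketing with $\mathbf{i}1$, $\mathbf{j}1$, and $\mathbf{k}1$. Since

\[ [E_{ij}, \mathbf{i}1] = 0, \quad [\mathbf{i}\tilde{E}_{ij}, \mathbf{i}1] = 0, \quad [\mathbf{i}\tilde{E}_{ij}, \mathbf{j}1] = 2\mathbf{k}\tilde{E}_{ij}, \]

and so on, we can nullify all terms in $a_{ij}E_{ij} + (b_{ij}\mathbf{i} - c_{ij}\mathbf{j} - d_{ij}\mathbf{k})\tilde{E}_{ij}$ except
one with a nonzero coefficient. This gives us, say, $\mathbf{i}\tilde{E}_{ij} \in \mathfrak{I}$. Then we apply
the formula

\[ [\mathbf{i}\tilde{E}_{ij}, \mathbf{i}e_{ii}] = E_{ij}, \]

which follows from (*), and we again have $E_{ij} \in \mathfrak{I}$, so we can reduce to
Case (i) as above.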


Exercises 


It was claimed in Section 5.7 that sp(n) is “simpler” than the Lie algebra gl(n,H)
of all n × n quaternion matrices. What was meant is that gl(n,H) is not a simple
Lie algebra; it contains two nontrivial ideals:

\[
\begin{aligned}
\mathfrak{R} &= \{X : X = r1 \text{ for some } r \in \mathbb{R}\} && \text{of dimension } 1, \\
\mathfrak{T} &= \{X : \operatorname{re}(\operatorname{Tr}(X)) = 0\} && \text{of dimension } 4n^2 - 1,
\end{aligned}
\]

where re denotes the real part of the quaternion.

6.6.1 Prove that $\mathfrak{R}$ is an ideal of gl(n,H).


6.6.2 Prove that, for any two quaternions p and q, re(pq) = re(qp).

6.6.3 Using Exercise 6.6.2 or otherwise, check that $\mathfrak{T}$ is an ideal of the real Lie
algebra gl(n,H).


6.6.4 Show that each X ∈ gl(n,H) has a unique decomposition of the form X =
R + T, where $R \in \mathfrak{R}$ and $T \in \mathfrak{T}$.


It turns out that $\mathfrak{R}$ and $\mathfrak{T}$ are the only nontrivial ideals of gl(n,H). This can be
shown by taking the $4n^2$ basis vectors $e_{ij}$, $\mathbf{i}e_{ij}$, $\mathbf{j}e_{ij}$, $\mathbf{k}e_{ij}$ for gl(n,H), and
considering a nonzero ideal $\mathfrak{I}$.


6.6.5 If $\mathfrak{I}$ has a member X with a nonzero entry $x_{ij}$, where $i \ne j$, show that $\mathfrak{I}$
equals $\mathfrak{T}$ or gl(n,H).


6.6.6 Show in general that $\mathfrak{I}$ equals either $\mathfrak{R}$, $\mathfrak{T}$, or gl(n,H).


6.7 Discussion 


As mentioned in Section 5.8, the classical simple Lie algebras were known 
to Lie in the 1880s, the exceptional simple algebras were discovered by 
Killing soon thereafter, and by 1894 Cartan had completely settled the 
question by an exhaustive proof that they are the only exceptions. The 
number of exceptional algebras, in complex form, is just five. All this happened
before it was realized that Lie algebras are quite elementary objects (namely,
vector spaces of matrices closed under the Lie bracket operation)! It has
been truly said that the Killing—Cartan classification of simple Lie alge- 
bras is one of the great mathematical discoveries of all time. But it is not 
necessary to use the sophisticated theory of “root systems,” developed by 
Killing and Cartan, merely to prove that the classical algebras so(n), su(n), 
and sp(n) are simple. As we have shown in this chapter, elementary matrix 
calculations suffice. 

The matrix proof that sl(n,C) is simple is sketched in Carter et al.
[1995], p. 10, and the simplicity of su(n) follows from it, but I have 
nowhere seen the corresponding elementary proofs for so(n) and sp(n). 
It is true that the calculations become a little laborious, but it is not a good 
idea to hide all matrix calculations. Many results were first discovered 
because somebody did such a calculation. 

The simplicity proofs in Sections 6.4 to 6.6 are trivial in the sense that
they can be discovered by anybody with enough patience. Given that sp(n),
say, is simple, we know that the ideal generated by any nonzero element
X is the whole of sp(n). Therefore, if we apply enough Lie bracket and
vector space operations to X, we will eventually obtain all the basis vectors
of sp(n). In other words, brute force search gives a proof that any nonzero
ideal of sp(n) equals sp(n) itself.

The Lie algebra so(4) is close to being simple, because it is the direct
product so(3) × so(3) of simple Lie algebras. Direct products of simple
Lie algebras are called semisimple. Sophisticated Lie theory tends to focus
on the broader class of semisimple Lie algebras, where so(4) is no longer
an anomaly. With this approach, one can also avoid the embarrassment of
using the term “complex simple Lie algebras” for algebras such as sl(n,C),
replacing it by the slightly less embarrassing “complex semisimple Lie
algebras.” (Of course, the real mistake was to call the imaginary numbers
“complex” in the first place.)


7 


The matrix logarithm 


PREVIEW 


To harness the full power of the matrix exponential we need its inverse 
function, the matrix logarithm function, log. Like the classical log, the 
matrix log is defined by a power series that converges only in a certain 
neighborhood of the identity. This makes results involving the logarithm 
more “local” than those involving the exponential alone, but in this chapter 
we are interested only in local information. 

The central result is that log and exp give a 1-to-1 correspondence, 
continuous in both directions, between a neighborhood of 1 in any matrix 
Lie group G and a neighborhood of 0 in its Lie algebra $\mathfrak{g} = T_1(G)$. Thus
the log function produces tangents. The proof relates the classical limit 
process defining tangents to the infinite series defining the logarithm. The 
need for limits motivates the definition of a matrix Lie group as a matrix 
group that is suitably closed under limits. 

The correspondence shows that elements of G sufficiently close to 1
are all of the form $e^X$, where $X \in \mathfrak{g}$. When two such elements, $e^X$ and $e^Y$,
have a product of the form $e^Z$ it is natural to ask how Z is related to X and
Y. The answer to this question is the Campbell–Baker–Hausdorff theorem,
which says that Z equals an infinite sum of elements of the Lie algebra $\mathfrak{g}$,
namely X + Y plus elements built from X and Y by Lie brackets.

We give a very elementary, but little-known, proof of the Campbell–
Baker–Hausdorff theorem, due to Eichler. The proof depends entirely on
manipulation of polynomials in noncommuting variables.

7.1 Logarithm and exponential


Motivated by the classical infinite series

\[
\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad \text{valid for real } x \text{ with } |x| < 1,
\]

we define the logarithm of a square matrix 1 + A with |A| < 1 by

\[
\log(1+A) = A - \frac{A^2}{2} + \frac{A^3}{3} - \frac{A^4}{4} + \cdots.
\]

This series is absolutely convergent (by comparison with the geometric
series) for |A| < 1, and hence log(1 + A) is a well-defined continuous function
in this neighborhood of 1.

The fundamental property of the matrix logarithm is the same as that
of the ordinary logarithm: it is the inverse of the exponential function.
The proof involves a trick we used in Section 5.2 to prove that $e^Ae^B =
e^{A+B}$ when AB = BA. Namely, we predict the result of a computation with
infinite series from knowledge of the result in the real variable case.


Inverse property of matrix logarithm. For any matrix $e^X$ within distance
1 of the identity,

\[ \log(e^X) = X. \]


Proof. Since $e^X = 1 + \frac{X}{1!} + \frac{X^2}{2!} + \cdots$ and $|e^X - 1| < 1$ we can write

\[
\begin{aligned}
\log(e^X) &= \log\left(1 + \left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)\right) \\
&= \left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)
 - \frac{1}{2}\left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)^2
 + \frac{1}{3}\left(\frac{X}{1!} + \frac{X^2}{2!} + \cdots\right)^3 - \cdots
\end{aligned}
\]

by the definition of the matrix logarithm. Also, the series is absolutely
convergent, so we can rearrange terms so as to collect all terms in $X^m$
together, for each m. This gives

\[
\log(e^X) = X + \left(\frac{1}{2!} - \frac{1}{2}\right)X^2
 + \left(\frac{1}{3!} - \frac{1}{2} + \frac{1}{3}\right)X^3 + \cdots.
\]

It is hard to describe the terms that make up the coefficient of $X^m$, for
arbitrary m > 1, but we know that their sum is zero! Why? Because exactly
the same terms occur in the expansion of $\log(e^x)$ for real x, when $|e^x - 1| < 1$, and
their sum is zero because $\log(e^x) = x$ under these conditions.
Thus $\log(e^X) = X$ as required.
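
The cancellation of coefficients claimed in this proof can be confirmed exactly for the first several powers of X. The following sketch treats the series as polynomials in one variable truncated at degree `N`; the cutoff `N = 8` and the names `poly_mul` and `log_coeffs` are choices made for the illustration only.

```python
from fractions import Fraction
from math import factorial

N = 8  # highest power of X that is tracked

def poly_mul(p, q):
    """Multiply two coefficient lists, discarding powers above N."""
    r = [Fraction(0)] * (N + 1)
    for a, pa in enumerate(p):
        if pa == 0:
            continue
        for b, qb in enumerate(q):
            if a + b <= N:
                r[a + b] += pa * qb
    return r

# p(X) = e^X - 1 = X/1! + X^2/2! + ... (no constant term)
p = [Fraction(0)] + [Fraction(1, factorial(k)) for k in range(1, N + 1)]

# log(1 + p) = p - p^2/2 + p^3/3 - ...
# Since p^n starts at X^n, the terms with n <= N determine all coefficients up to X^N.
log_coeffs = [Fraction(0)] * (N + 1)
power = [Fraction(1)] + [Fraction(0)] * N   # p^0
for n in range(1, N + 1):
    power = poly_mul(power, p)
    sign = 1 if n % 2 == 1 else -1
    for m in range(N + 1):
        log_coeffs[m] += sign * power[m] / n

print(log_coeffs)  # coefficient of X is 1, every other coefficient is 0
```

Because only powers of a single X are involved, the same polynomial identities hold whether X is a real number or a matrix, which is exactly the point of the argument.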


The inverse property allows us to derive certain properties of the matrix 
logarithm from corresponding properties of the matrix exponential. For 
example: 


Multiplicative property of matrix logarithm. If AB = BA, and log(A),
log(B), and log(AB) are all defined, then

\[ \log(AB) = \log(A) + \log(B). \]


Proof. Suppose that log(A) = X and log(B) = Y, so $e^X = A$ and $e^Y = B$ by
the inverse property of log. Notice that XY = YX because

\[
\begin{aligned}
X &= \log(1 + (A - 1)) = (A - 1) - \frac{(A-1)^2}{2} + \frac{(A-1)^3}{3} - \cdots, \\
Y &= \log(1 + (B - 1)) = (B - 1) - \frac{(B-1)^2}{2} + \frac{(B-1)^3}{3} - \cdots,
\end{aligned}
\]

and the series commute because A and B do. Thus it follows from the
addition formula for exp proved in Section 5.2 that

\[ AB = e^Xe^Y = e^{X+Y}. \]

Taking log of both sides of this equation, we get

\[ \log(AB) = X + Y = \log(A) + \log(B) \]

by the inverse property of the matrix logarithm again.


Exercises 


The log series

\[
\log(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots
\]

was first published by Nicholas Mercator in a book entitled Logarithmotechnia in
1668. Mercator’s derivation of the series was essentially this:

\[
\log(1+x) = \int_0^x \frac{dt}{1+t} = \int_0^x (1 - t + t^2 - t^3 + \cdots)\,dt
 = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots.
\]

Isaac Newton discovered the log series at about the same time, but took the idea
further, discovering the inverse relationship with the exponential series as well.
He discovered the exponential series by solving the equation y = log(1 + x) as
follows.


7.1.1 Supposing $x = a_0 + a_1y + a_2y^2 + \cdots$ (the function we call $e^y - 1$), show that

\[
\begin{aligned}
y = {} & (a_0 + a_1y + a_2y^2 + \cdots) \\
& - \frac{1}{2}(a_0 + a_1y + a_2y^2 + \cdots)^2 \\
& + \frac{1}{3}(a_0 + a_1y + a_2y^2 + \cdots)^3 - \cdots
\end{aligned}
\tag{*}
\]

7.1.2 By equating the constant terms on both sides of (*), show that $a_0 = 0$.
7.1.3 By equating coefficients of y on both sides of (*), show that $a_1 = 1$.
7.1.4 By equating coefficients of $y^2$ on both sides of (*), show that $a_2 = 1/2$.


7.1.5 See whether you can go as far as Newton, who also found that $a_3 = 1/6$,
$a_4 = 1/24$, and $a_5 = 1/120$.


Newton then guessed that $a_n = 1/n!$ “by observing the analogy of the series.”
Unlike us, he did not have independent knowledge of the exponential function 
ensuring that its coefficients follow the pattern observed in the first few. 

As with exp, term-by-term differentiation and series manipulation give some 
familiar formulas. 


7.1.6 Prove that $\frac{d}{dt}\log(1 + At) = A(1 + At)^{-1}$.


7.2 The exp function on the tangent space 


For all the groups G we have seen so far it has been easy to find a general
form for tangent vectors A′(0) from the equation(s) defining the members
A of G. We can then check that all the matrices X of this form are mapped
into G by exp, and that $e^{tX}$ lies in G along with $e^X$, in which case X is a
tangent vector to G at 1. Thus exp solves the problem of finding enough
smooth paths in G to give the whole tangent space $T_1(G) = \mathfrak{g}$.

But if we are not given an equation defining the matrices A in G, we
may not be able to find tangent matrices in the form A′(0) in the first place,
so we need a different route to the tangent space. The log function looks
promising, because we can certainly get back into G by applying exp to a
value X of the log function, since exp inverts log.

However, it is not clear that log maps any part of G into $T_1(G)$, except
the single point 1 ∈ G. We need to make a closer study of the relation
between the limits that define tangent vectors and the definition of log.
This train of thought leads to the realization that G must be closed under
certain limits, and it prompts the following definition (foreshadowed in
Section 1.1) of the main concept in this book.

Definition. A matrix Lie group G is a group of matrices that is closed
under nonsingular limits. That is, if $A_1, A_2, A_3, \ldots$ is a convergent sequence
of matrices in G, with limit A, and if det(A) ≠ 0, then A ∈ G.


This closure property makes possible a fairly immediate proof that exp
indeed maps $T_1(G)$ back into G.


Exponentiation of tangent vectors. If A′(0) is the tangent vector at 1 to
a matrix Lie group G, then $e^{A'(0)} \in G$. That is, exp maps the tangent space
$T_1(G)$ into G.


Proof. Suppose that A(t) is a smooth path in G such that A(0) = 1, and
that A′(0) is the corresponding tangent vector at 1. By definition of the
derivative we have

\[
A'(0) = \lim_{\Delta t \to 0} \frac{A(\Delta t) - 1}{\Delta t} = \lim_{n \to \infty} \frac{A(1/n) - 1}{1/n},
\]

where n takes all natural number values greater than some $n_0$. We compare
this formula with the definition of log A(1/n),

\[
\log A(1/n) = (A(1/n) - 1) - \frac{(A(1/n) - 1)^2}{2} + \frac{(A(1/n) - 1)^3}{3} - \cdots,
\]

which also holds for natural numbers n greater than some $n_0$. Dividing
both sides of the log formula by 1/n we get

\[
n\log A(1/n) = \frac{\log A(1/n)}{1/n}
 = \frac{A(1/n) - 1}{1/n} - \frac{A(1/n) - 1}{1/n}\left[\frac{A(1/n) - 1}{2} - \frac{(A(1/n) - 1)^2}{3} + \cdots\right].
\tag{*}
\]

Now, taking $n_0$ large enough that $|A(1/n) - 1| < \varepsilon < 1/2$, the series in
square brackets has sum of absolute value less than $\varepsilon + \varepsilon^2 + \varepsilon^3 + \cdots < 2\varepsilon$,
so its sum tends to 0 as n tends to ∞. It follows that the right side of (*) has
the limit

\[ A'(0) - A'(0)[0] = A'(0) \]

as n → ∞. The left side of (*), n log A(1/n), has the same limit, so

\[ A'(0) = \lim_{n \to \infty} n\log A(1/n). \tag{**} \]

Taking exp of equation (**), we get

\[
\begin{aligned}
e^{A'(0)} &= e^{\lim_{n \to \infty} n\log A(1/n)} \\
&= \lim_{n \to \infty} e^{n\log A(1/n)} && \text{because exp is continuous} \\
&= \lim_{n \to \infty} \left(e^{\log A(1/n)}\right)^n && \text{because } e^{A+B} = e^Ae^B \text{ when } AB = BA \\
&= \lim_{n \to \infty} A(1/n)^n && \text{because exp is the inverse of log.}
\end{aligned}
\]

Now A(1/n) ∈ G by assumption, so $A(1/n)^n \in G$ because G is closed under
products. We therefore have a convergent sequence of members of G, and
its limit $e^{A'(0)}$ is nonsingular because it has inverse $e^{-A'(0)}$. So $e^{A'(0)} \in G$,
by the closure of G under nonsingular limits.

In other words, exp maps the tangent space $T_1(G) = \mathfrak{g}$ into G.
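
The limit $e^{A'(0)} = \lim_{n\to\infty} A(1/n)^n$ can be watched numerically in a small case. In the sketch below the path is an assumed example, not one from the text: A(t) is the rotation of the plane through angle sin t, a smooth path in SO(2) with A(0) = 1, whose tangent vector A′(0) exponentiates to the rotation through 1 radian.

```python
import math

def matmul(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_power(A, n):
    """A^n by repeated multiplication."""
    P = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        P = matmul(P, A)
    return P

def rotation(theta):
    """Rotation of the plane through angle theta."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def A(t):
    """Assumed sample path in SO(2): rotation through angle sin(t)."""
    return rotation(math.sin(t))

# A'(0) = [[0, -1], [1, 0]], and its exponential is the rotation through 1 radian.
target = rotation(1.0)

def error(n):
    """Largest entry of |A(1/n)^n - e^{A'(0)}|."""
    P = mat_power(A(1.0 / n), n)
    return max(abs(P[i][j] - target[i][j]) for i in range(2) for j in range(2))

for n in (10, 100, 1000):
    print(n, error(n))
```

Here $A(1/n)^n$ is the rotation through $n\sin(1/n)$, which tends to 1 as n grows, so the printed errors shrink toward zero.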


The proof in the opposite direction, from G into $T_1(G)$, is more subtle.
It requires a deeper study of limits, which we undertake in the next section. 


Exercises 


7.2.1 Deduce from exponentiation of tangent vectors that 


T(G) ={X: e* €G forallt € R}. 


The property 7)(G) = {X : e* € G for all t € R} is used as a definition of T(G) 
by some authors, for example Hall [2003]. It has the advantage of making it clear 
that exp maps 7;(G) into G. On the other hand, with this definition, we have to 
check that T)(G) is a vector space. 


7.2.2. Given X as the tangent vector to e’*, and Y as the tangent vector to e”, 
show that X +¥ is the tangent vector to A(t) = ee”. 


7.2.3 Similarly, show that if X is a tangent vector then so is rX for any r € R. 


The formula $A'(0) = \lim_{n\to\infty} n \log A(1/n)$ that emerges in the proof above can
actually be used in two directions. It can be used to prove that exp maps $T_1(G)$
into $G$ when combined with the fact that $G$ is closed under products (and hence
under $n$th powers). And it can be used to prove that log maps (a neighborhood of
1 in) $G$ into $T_1(G)$ when combined with the fact that $G$ is closed under $n$th roots.

Unfortunately, proving closure under $n$th roots is as hard as proving that log
maps into $T_1(G)$, so we need a different approach to the latter theorem.
Nevertheless, it is interesting to see how $n$th roots are related to the behavior of the
log function, so we develop the relationship in the following exercises.

7.2.4 Suppose that, for each $A$ in some neighborhood $\mathcal{N}$ of 1 in $G$, there is a
smooth function $A(t)$, with values in $G$, such that $A(1/n) = A^{1/n}$ for
$n = 1, 2, 3, \ldots$. Show that $A'(0) = \log A$, so $\log A \in T_1(G)$.


7.2.5 Suppose, conversely, that log maps some neighborhood $\mathcal{N}$ of 1 in $G$ into
$T_1(G)$. Explain why we can assume that $\mathcal{N}$ is mapped by log onto an
$\varepsilon$-ball $N_\varepsilon(0)$ in $T_1(G)$.


7.2.6 Taking $\mathcal{N}$ as in Exercise 7.2.4, and $A \in \mathcal{N}$, show that $t \log A \in T_1(G)$ for
all $t \in [0,1]$, and deduce that $A^{1/n}$ exists for $n = 1, 2, 3, \ldots$.


7.3 Limit properties of log and exp 


In 1929, von Neumann created a new approach to Lie theory by confin- 
ing attention to matrix Lie groups. Even though the most familiar Lie 
groups are matrix groups (and, in fact, the first nonmatrix examples were 
not discovered until the 1930s), Lie theory began as the study of general 
“continuous” groups and von Neumann’s approach was a radical simplifi- 
cation. In particular, von Neumann defined “tangents” prior to the concept 
of differentiability—going back to the idea that a tangent vector is the limit 
of a sequence of “chord” vectors—as one sees tangents in a first calculus 
course (Figure 7.1). 


Figure 7.1: The tangent as the limit of a sequence.


Definition. $X$ is a sequential tangent vector to $G$ at 1 if there is a sequence
$(A_m)$ of members of $G$, and a sequence $(\alpha_m)$ of real numbers, such that
$A_m \to 1$ and $(A_m - 1)/\alpha_m \to X$ as $m \to \infty$.

If $A(t)$ is a smooth path in $G$ with $A(0) = 1$, then the sequence of points
$A_m = A(1/m)$ tends to 1 and

$$A'(0) = \lim_{m\to\infty} \frac{A_m - 1}{1/m},$$

so any ordinary tangent vector $A'(0)$ is a sequential tangent vector. But
sometimes it is convenient to arrive at tangent vectors via sequences rather 
than via smooth paths, so it would be nice to be sure that all sequential 
tangent vectors are in fact ordinary tangent vectors. This is confirmed by 
the following theorem. 


Smoothness of sequential tangency. Suppose that $(A_m)$ is a sequence in
a matrix Lie group $G$ such that $A_m \to 1$ as $m \to \infty$, and that $(\alpha_m)$ is a
sequence of real numbers such that $(A_m - 1)/\alpha_m \to X$ as $m \to \infty$.

Then $e^{tX} \in G$ for all real $t$ (and therefore $X$ is the tangent at 1 to the
smooth path $e^{tX}$).


Proof. Let $X = \lim_{m\to\infty} (A_m - 1)/\alpha_m$. First we prove that $e^X \in G$. Then we
indicate how the proof may be modified to show that $e^{tX} \in G$.

Given that $(A_m - 1)/\alpha_m \to X$ as $m \to \infty$, it follows that $\alpha_m \to 0$ as
$A_m \to 1$, and hence $1/\alpha_m \to \infty$. Then if we set

$$a_m = \text{nearest integer to } 1/\alpha_m,$$

we also have $a_m(A_m - 1) \to X$ as $m \to \infty$. Since $a_m$ is an integer,

$$\begin{aligned}
\log(A_m^{a_m}) &= a_m \log(A_m) && \text{by the multiplicative property of log} \\
&= a_m(A_m - 1) - a_m(A_m - 1)\left[\frac{A_m - 1}{2} - \frac{(A_m - 1)^2}{3} + \cdots\right].
\end{aligned}$$

And since $A_m \to 1$ we can argue as in Section 7.2 that the series in square
brackets tends to zero. Then, since $\lim_{m\to\infty} a_m(A_m - 1) = X$, we have

$$X = \lim_{m\to\infty} \log(A_m^{a_m}).$$

It follows, by the inverse property of log and the continuity of exp, that

$$e^X = \lim_{m\to\infty} A_m^{a_m}.$$

Since $a_m$ is an integer, $A_m^{a_m} \in G$ by the closure of $G$ under products. And
then, by the closure of $G$ under nonsingular limits,

$$e^X = \lim_{m\to\infty} A_m^{a_m} \in G.$$

To prove that $e^{tX} \in G$ for any real $t$ one replaces $1/\alpha_m$ in the above
argument by $t/\alpha_m$. If

$$b_m = \text{nearest integer to } t/\alpha_m,$$

we similarly have $b_m(A_m - 1) \to tX$ as $m \to \infty$. And if we consider the
series for

$$\log(A_m^{b_m}) = b_m \log(A_m)$$

we similarly find that

$$e^{tX} = \lim_{m\to\infty} A_m^{b_m} \in G$$

by the closure of G under nonsingular limits.
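
The "nearest integer" device in this proof can be tried out on a concrete sequence. The following sketch is an illustration only, under assumptions of my own choosing: $G$ is SO(2), $A_m$ is rotation through the hypothetical angle $\alpha_m = 1/(m\sqrt{2})$, and $X$ is the matrix `J` with rows (0, -1) and (1, 0), so that $e^{tX}$ is rotation through $t$. It checks that the chord quotients $(A_m - 1)/\alpha_m$ approach $X$ and that the integer powers $A_m^{b_m}$ approach $e^{tX}$ (for $t \ge 0$, to keep the power routine simple).

```python
import math

def rot(theta):
    """Rotation of the plane through angle theta, a member of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(P, n):
    """n-th power of P by repeated multiplication (n >= 0)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        result = matmul(result, P)
    return result

def dist(P, Q):
    return math.sqrt(sum((P[i][j] - Q[i][j]) ** 2
                         for i in range(2) for j in range(2)))

J = [[0.0, -1.0], [1.0, 0.0]]  # the sequential tangent X in this example

def alpha(m):
    """Hypothetical real sequence alpha_m tending to 0."""
    return 1.0 / (m * math.sqrt(2.0))

def chord_error(m):
    """|(A_m - 1)/alpha_m - X| for A_m = rot(alpha_m)."""
    a, Am = alpha(m), rot(alpha(m))
    quotient = [[(Am[i][j] - (1.0 if i == j else 0.0)) / a for j in range(2)]
                for i in range(2)]
    return dist(quotient, J)

def power_error(t, m):
    """|A_m^{b_m} - e^{tX}| with b_m the nearest integer to t/alpha_m (t >= 0)."""
    b = round(t / alpha(m))
    return dist(matpow(rot(alpha(m)), b), rot(t))

for m in (10, 100, 1000):
    print(m, chord_error(m), power_error(2.5, m))
```

In this example the rounding error in $b_m \alpha_m$ is at most $\alpha_m/2$, which is why the second column of errors is bounded by $1/(2m)$.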


This theorem is the key to proving that log maps a neighborhood of 1 in
$G$ onto a neighborhood of 0 in $T_1(G)$, as we will see in the next section. It
is also the core of the result of von Neumann [1929] that matrix Lie groups
are “smooth manifolds.” We do not define or investigate smooth manifolds
in this book, but one can glimpse the emergence of “smoothness” in the
passage from the sequence $(A_m)$ to the curve $e^{tX}$ in the above proof.


Exercises 


Having proved that sequential tangents are the same as the smooth tangents we 
considered previously, we conclude that sequential tangents have the real vector 
space properties. Still, it is interesting to see how the vector space properties 
follow from the definition of sequential tangent. 


7.3.1 If X and Y are sequential tangents to a group G at 1, show that X + Y is also. 


7.3.2 If X is a sequential tangent to a group G at 1, show that rX is also, for any 
real number r. 


7.4 The log function into the tangent space


By a “neighborhood” of 1 in $G$ we mean a set of the form

$$N_\delta(1) = \{A \in G : |A - 1| < \delta\},$$

where $|B|$ denotes the absolute value of the matrix $B$, defined in Section 4.5.
We also call $N_\delta(1)$ the $\delta$-neighborhood of 1. Then we have the following
theorem.

The log of a neighborhood of 1. For any matrix Lie group $G$ there is a
neighborhood $N_\delta(1)$ mapped into $T_1(G)$ by log.


Proof. Suppose on the contrary that no $N_\delta(1)$ is mapped into $T_1(G)$ by log.
Then we can find $A_1, A_2, A_3, \ldots \in G$ with $A_m \to 1$ as $m \to \infty$, and with each
$\log A_m \notin T_1(G)$.

Of course, $G$ is contained in some $M_n(\mathbb{C})$. So each $\log A_m$ is in $M_n(\mathbb{C})$
and we can write

$$\log A_m = X_m + Y_m,$$

where $X_m$ is the component of $\log A_m$ in $T_1(G)$ and $Y_m \ne 0$ is the component
in $T_1(G)^\perp$, the orthogonal complement of $T_1(G)$ in $M_n(\mathbb{C})$. We note that
$X_m, Y_m \to 0$ as $m \to \infty$ because $A_m \to 1$ and log is continuous.

Next we consider the matrices $Y_m/|Y_m| \in T_1(G)^\perp$. These all have
absolute value 1, so they lie on the sphere $\mathcal{S}$ of radius 1 and center 0 in
$M_n(\mathbb{C})$. It follows from the boundedness of $\mathcal{S}$ that the sequence $(Y_m/|Y_m|)$
has a convergent subsequence, and the limit $Y$ of this subsequence is also
a vector in $T_1(G)^\perp$ of length 1. In particular, $Y \notin T_1(G)$.

Taking the subsequence with limit $Y$ in place of the original sequence
we have

$$\lim_{m\to\infty} \frac{Y_m}{|Y_m|} = Y.$$

Finally, we consider the sequence of terms

$$T_m = e^{-X_m} A_m.$$

Each $T_m \in G$ because $-X_m \in T_1(G)$; hence $e^{-X_m} \in G$ by the exponentiation
of tangent vectors in Section 7.2, and $A_m \in G$ by hypothesis. On the other
hand, $A_m = e^{X_m + Y_m}$ by the inverse property of log, so

$$\begin{aligned}
T_m &= e^{-X_m} e^{X_m + Y_m} \\
&= \left(1 - X_m + \frac{X_m^2}{2!} - \cdots\right)\left(1 + X_m + Y_m + \frac{(X_m + Y_m)^2}{2!} + \cdots\right) \\
&= 1 + Y_m + \text{higher-order terms}.
\end{aligned}$$

Admittedly, these higher-order terms include $X_m^2$, and other powers of $X_m$,
that are not necessarily small in comparison with $Y_m$. However, these
powers of $X_m$ are those in

$$1 = e^{-X_m} e^{X_m},$$

so they sum to zero. (I thank Brian Hall for this observation.) Therefore,

$$\lim_{m\to\infty} \frac{T_m - 1}{|Y_m|} = \lim_{m\to\infty} \frac{Y_m}{|Y_m|} = Y.$$

Since each $T_m \in G$, it follows that the sequential tangent

$$\lim_{m\to\infty} \frac{T_m - 1}{|Y_m|} = Y$$

is in $T_1(G)$ by the smoothness of sequential tangents proved in Section 7.3.

But $Y \notin T_1(G)$, as observed above. This contradiction shows that our
original assumption was false, so there is a neighborhood $N_\delta(1)$ mapped
into $T_1(G)$ by log.


Corollary. The log function gives a bijection, continuous in both
directions, between $N_\delta(1)$ in $G$ and $\log N_\delta(1)$ in $T_1(G)$.


Proof. The continuity of log, and of its inverse function exp, shows that
there is a 1-to-1 correspondence, continuous in both directions, between
$N_\delta(1)$ and its image $\log N_\delta(1)$ in $T_1(G)$.


If $N_\delta(1)$ in $G$ is mapped into $T_1(G)$ by log, then each $A \in N_\delta(1)$ has the
form $A = e^X$, where $X = \log A \in T_1(G)$. Thus the paradise of SO(2) and
SU(2), where each group element is the exponential of a tangent vector,
is partly regained by the theorem above. Any matrix Lie group $G$ has at
least a neighborhood of 1 in which each element is the exponential of a
tangent vector.
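
For a concrete group the theorem can be checked numerically. The sketch below is an illustration under assumptions of my own: it takes $G$ = SO(3), whose tangent space at 1 consists of the skew-symmetric matrices, picks a hypothetical element `A` near 1 (a product of two small rotations), computes $\log A$ from the series in powers of $A - 1$, and confirms that the result is skew-symmetric and exponentiates back to `A`. The helper names and the truncation lengths of the series are choices made for this example.

```python
import math

N = 3  # 3 x 3 real matrices stored as nested lists

def identity():
    return [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def combine(P, Q, c=1.0):
    """Return the matrix P + c*Q."""
    return [[P[i][j] + c * Q[i][j] for j in range(N)] for i in range(N)]

def transpose(P):
    return [[P[j][i] for j in range(N)] for i in range(N)]

def absolute_value(P):
    """Euclidean length of the entries of P."""
    return math.sqrt(sum(x * x for row in P for x in row))

def log_series(A, terms=60):
    """log A = W - W^2/2 + W^3/3 - ... with W = A - 1 (assumes |A - 1| < 1)."""
    W = combine(A, identity(), -1.0)
    total = [[0.0] * N for _ in range(N)]
    power = identity()
    for k in range(1, terms + 1):
        power = matmul(power, W)
        total = combine(total, power, (-1.0) ** (k + 1) / k)
    return total

def exp_series(X, terms=30):
    """e^X = 1 + X + X^2/2! + ... truncated after the given number of terms."""
    total, term = identity(), identity()
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, X)]
        total = combine(total, term)
    return total

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

A = matmul(rot_x(0.2), rot_z(0.1))          # hypothetical element of SO(3) near 1
X = log_series(A)
skew_defect = absolute_value(combine(X, transpose(X)))            # |X + X^T|
roundtrip_error = absolute_value(combine(exp_series(X), A, -1.0))  # |e^X - A|
print(skew_defect, roundtrip_error)
```

Both printed quantities should be at the level of floating-point rounding, consistent with $\log A$ lying in the tangent space and with exp inverting log near 1.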

The corollary tells us that the set $\log N_\delta(1)$ is a “neighborhood” of 0
in $T_1(G)$ in a more general sense (the topological sense) that we will
discuss in Chapter 8. The existence of this continuous bijection between
neighborhoods finally establishes that $G$ has a topological dimension equal
to the real vector space dimension of $T_1(G)$, thanks to the deep theorem
of Brouwer [1911] on the invariance of topological dimension. This gives
a broad justification for the Lie theory convention, already mentioned in
Section 5.5, of defining the dimension of a Lie group to be the dimension
of its Lie algebra. In practice, arguments about dimension are made at the
Lie algebra level, where we can use linear algebra, so we will not actually
need the topological concept of dimension.


Exercises 


The continuous bijection between neighborhoods of 1 in $G$ and of 0 in $T_1(G)$
enables us to show the existence of $n$th roots in a matrix Lie group.

7.4.1 Show that each $A \in N_\delta(1)$ has a unique $n$th root, for $n = 1, 2, 3, \ldots$.


7.4.2 Show that the $2 \times 2$ identity matrix 1 has two square roots in SO(2), but that
one of them is “far” from 1.


7.5 SO(n), SU(n), and Sp(n) revisited 


In Section 3.8 we proved Schreier’s theorem that any discrete normal
subgroup of a path-connected group lies in its center. This gives us the discrete
normal subgroups of SO(n), SU(n), and Sp(n), since the latter groups are
path-connected and we found their centers in Section 3.7. What remains is
to find out whether SO(n), SU(n), and Sp(n) have any nondiscrete normal
subgroups. We claimed in Section 3.9 that the tangent space would enable
us to see any nondiscrete normal subgroups, and we are finally in a position
to explain why.

For convenience we assume a plausible result that will be proved
rigorously in Section 8.6: if $N_\delta(1)$ is a neighborhood of 1 in a path-connected
group $G$, then any element of $G$ is a product of members of $N_\delta(1)$. We say
that $N_\delta(1)$ generates the whole group $G$. With this assumption, we have
the following theorem.


Tangent space visibility. If $G$ is a path-connected matrix Lie group with
discrete center and a nondiscrete normal subgroup $H$, then $T_1(H) \ne \{0\}$.


Proof. Since the center $Z(G)$ of $G$ is discrete, and $H$ is not, we can find a
neighborhood $N_\delta(1)$ in $G$ that includes elements $B \ne 1$ in $H$ but no member
of $Z(G)$ other than 1. If $B \ne 1$ is a member of $H$ in $N_\delta(1)$, then $B$ does not
commute with some $A \in N_\delta(1)$. For if $B$ commutes with all elements of $N_\delta(1)$
then $B$ commutes with all elements of $G$ (because $N_\delta(1)$ generates $G$), so
$B \in Z(G)$, contrary to our choice of $N_\delta(1)$.

By taking $\delta$ sufficiently small we can ensure, by the theorem of the
previous section, that $A = e^X$ for some $X \in T_1(G)$. Indeed, we can ensure
that the whole path $A(t) = e^{tX}$ is in $N_\delta(1)$ for $0 \le t \le 1$.

Now consider the smooth path $C(t) = e^{tX} B e^{-tX} B^{-1}$, which runs from
1 to $e^X B e^{-X} B^{-1} = ABA^{-1}B^{-1}$ in $G$. A calculation using the product rule
for differentiation (exercise) shows that the tangent vector to $C(t)$ at 1 is

$$C'(0) = X - BXB^{-1}.$$

Since $H$ is a normal subgroup of $G$, and $B \in H$, we have $e^{tX} B e^{-tX} \in H$.
Then $e^{tX} B e^{-tX} B^{-1} \in H$ as well, so $C(t)$ is in fact a smooth path in $H$ and

$$C'(0) = X - BXB^{-1} \in T_1(H).$$

Thus to prove that $T_1(H) \ne \{0\}$ it suffices to show that $X - BXB^{-1} \ne 0$.
Well,

$$\begin{aligned}
X - BXB^{-1} = 0 &\Rightarrow BXB^{-1} = X \\
&\Rightarrow e^{BXB^{-1}} = e^X \\
&\Rightarrow Be^XB^{-1} = e^X \\
&\Rightarrow Be^X = e^XB \\
&\Rightarrow BA = AB,
\end{aligned}$$

contrary to our choice of $A$ and $B$.
This contradiction proves that $T_1(H) \ne \{0\}$.


Corollary. If $H$ is a nontrivial normal subgroup of $G$ under the conditions
above, then $T_1(H)$ is a nontrivial ideal of $T_1(G)$.


Proof. We know from Section 6.1 that $T_1(H)$ is an ideal of $T_1(G)$, and
$T_1(H) \ne \{0\}$ by the theorem.

If $T_1(H) = T_1(G)$ then $H$ fills $N_\delta(1)$ in $G$, by the log-exp bijection
between neighborhoods of the identity in $G$ and $T_1(G)$. But then $H = G$
because $G$ is path-connected and hence generated by $N_\delta(1)$. Thus if $H \ne G$,
then $T_1(H) \ne T_1(G)$.


It follows from the theorem that any nondiscrete normal subgroup $H$
of $G$ = SO(n), SU(n), Sp(n) gives a nonzero ideal $T_1(H)$ in $T_1(G)$. The
corollary says that $T_1(H)$ is nontrivial, that is, $T_1(H) \ne T_1(G)$ if $H \ne G$.
Thus we finally know for sure that the only nontrivial normal subgroups of
SO(n), SU(n), and Sp(n) are the subgroups of their centers. (And hence
all the nontrivial normal subgroups are finite cyclic groups.)


SO(3) revisited 


In Section 2.3 we showed that SO(3) is simple (the result that launched
our whole investigation of Lie groups) by a somewhat tricky geometric
argument. We can now give a proof based on the easier facts that the center
of SO(3) is trivial, which was proved in Section 3.5 (also in Exercises 3.5.4
and 3.5.5), and that $\mathfrak{so}(3)$ is simple, which was proved in Section 6.1. The
hard work can be done by general theorems.

By the theorem in Section 3.8, any discrete normal subgroup of SO(3)
is contained in Z(SO(3)), and hence is trivial. By the corollary above,
and the theorem in Section 6.1, any nondiscrete normal subgroup of SO(3)
yields a nontrivial ideal of $\mathfrak{so}(3)$, which does not exist.


Exercises 

7.5.1 If $C(t) = e^{tX} B e^{-tX} B^{-1}$, check that $C'(0) = X - BXB^{-1}$.


7.5.2 Give an example of a connected matrix Lie group with a nondiscrete normal
subgroup $H$ such that $T_1(H) = \{0\}$.


7.5.3 Prove that U(n) has no nontrivial normal subgroup except Z(U(n)).


7.5.4 The tangent space visibility theorem also holds if $G$ is not path-connected.
Explain how to modify the proof in this case.


7.6 The Campbell—Baker—Hausdorff theorem 


The results of Section 7.4 show that, in some neighborhood of 1, any two
elements of $G$ have the form $e^X$ and $e^Y$ for some $X, Y$ in $\mathfrak{g}$, and that the
product of these two elements, $e^X e^Y$, is $e^Z$ for some $Z$ in $\mathfrak{g}$. The
Campbell-Baker-Hausdorff theorem says that more than this is true, namely, the $Z$
such that $e^X e^Y = e^Z$ is the sum of a series $X + Y +$ Lie bracket terms
composed from $X$ and $Y$. In this sense, the Lie bracket on $\mathfrak{g}$ “determines” the
product operation on $G$.

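To give an inkling of how this theorem comes about, we expand $e^X$
and $e^Y$ as infinite series, form the product series, and calculate the first few
terms of its logarithm, $Z$. By the definition of the exponential function we
have

$$e^X = 1 + \frac{X}{1!} + \frac{X^2}{2!} + \frac{X^3}{3!} + \cdots, \qquad
e^Y = 1 + \frac{Y}{1!} + \frac{Y^2}{2!} + \frac{Y^3}{3!} + \cdots,$$

and therefore

$$e^X e^Y = 1 + X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots + \frac{X^m Y^n}{m!\,n!} + \cdots$$

with a term for each pair of integers $m, n \ge 0$. It follows, since

$$\log(1 + W) = W - \frac{W^2}{2} + \frac{W^3}{3} - \frac{W^4}{4} + \cdots,$$

that

$$\begin{aligned}
Z = \log(e^X e^Y) &= \left(X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots\right) \\
&\quad - \frac{1}{2}\left(X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots\right)^2 \\
&\quad + \frac{1}{3}\left(X + Y + XY + \frac{X^2}{2!} + \frac{Y^2}{2!} + \cdots\right)^3 - \cdots \\
&= X + Y + \frac{1}{2}XY - \frac{1}{2}YX + \text{higher-order terms} \\
&= X + Y + \frac{1}{2}[X,Y] + \text{higher-order terms}.
\end{aligned}$$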
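
The second-order formula can be tested numerically on small matrices. The sketch below is an illustration under assumptions of my own: `P` and `Q` are two hypothetical noncommuting 2 x 2 matrices, $X = \varepsilon P$ and $Y = \varepsilon Q$, and exp and log are computed from truncated series. If the formula is right, the error of $X + Y$ should shrink like $\varepsilon^2$, while the error of $X + Y + \frac{1}{2}[X,Y]$ should shrink like $\varepsilon^3$.

```python
import math

def identity():
    return [[1.0, 0.0], [0.0, 1.0]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def combine(P, Q, c=1.0):
    """Return the matrix P + c*Q."""
    return [[P[i][j] + c * Q[i][j] for j in range(2)] for i in range(2)]

def scale(P, c):
    return [[c * x for x in row] for row in P]

def absolute_value(P):
    return math.sqrt(sum(x * x for row in P for x in row))

def exp_series(X, terms=30):
    total, term = identity(), identity()
    for k in range(1, terms + 1):
        term = scale(matmul(term, X), 1.0 / k)
        total = combine(total, term)
    return total

def log_series(A, terms=40):
    """log A as a series in W = A - 1 (assumes |A - 1| < 1)."""
    W = combine(A, identity(), -1.0)
    total, power = [[0.0, 0.0], [0.0, 0.0]], identity()
    for k in range(1, terms + 1):
        power = matmul(power, W)
        total = combine(total, power, (-1.0) ** (k + 1) / k)
    return total

def bracket(P, Q):
    """Lie bracket [P, Q] = PQ - QP."""
    return combine(matmul(P, Q), matmul(Q, P), -1.0)

P = [[0.0, 1.0], [0.0, 0.0]]  # hypothetical noncommuting pair
Q = [[0.0, 0.0], [1.0, 0.0]]

def bch_errors(eps):
    """Errors of the first- and second-order approximations to log(e^X e^Y)."""
    X, Y = scale(P, eps), scale(Q, eps)
    Z = log_series(matmul(exp_series(X), exp_series(Y)))
    first = combine(X, Y)
    second = combine(first, bracket(X, Y), 0.5)
    return (absolute_value(combine(Z, first, -1.0)),
            absolute_value(combine(Z, second, -1.0)))

for eps in (0.04, 0.02, 0.01):
    print(eps, bch_errors(eps))
```

Halving $\varepsilon$ should divide the first error by about 4 and the second by about 8, which is the numerical signature of an error that starts at third order.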


The hard part of the Campbell-Baker-Hausdorff theorem is to prove that 
all the higher-order terms are composed from X and Y by Lie brackets. 

Campbell attempted to do this in 1897. His work was amended by 
Baker in 1905, with further corrections by Hausdorff producing a com- 
plete proof in 1906. However, these first proofs were very long, and many 
attempts have since been made to derive the theorem with greater economy 
and insight. Modern textbook proofs are typically only a few pages long, 
but they draw on differentiation, integration, and specialized machinery 
from Lie theory. 

The most economical proof I know is one by Eichler [1968]. It is only 
two pages long and purely algebraic, showing by induction on $n$ that all
terms of order $n > 1$ are linear combinations of Lie brackets. The algebra
is very simple, but ingenious (as you would expect, since the theorem is 
surely not trivial). In my opinion, this is also an insightful proof, showing 
as it does that the theorem depends only on simple algebraic facts. I present 
Eichler’s proof, with some added explanation, in the next section. 


Exercises 


7.6.1 Show that the cubic term in $\log(e^X e^Y)$ is

$$\frac{1}{12}\left(X^2Y + XY^2 + YX^2 + Y^2X - 2XYX - 2YXY\right).$$


7.6.2 Show that the cubic polynomial in Exercise 7.6.1 is a linear combination of
$[X,[X,Y]]$ and $[Y,[Y,X]]$.

The idea of representing the $Z$ in $e^Z = e^X e^Y$ by a power series in noncommuting
variables $X$ and $Y$ allows us to prove the converse of the theorem that $XY = YX$
implies $e^X e^Y = e^{X+Y}$.


7.6.3 Suppose that $e^X e^Y = e^Y e^X$. By appeal to the proof of the log multiplicative
property in Section 7.1, or otherwise, show that $XY = YX$.


7.6.4 Deduce from Exercise 7.6.3 that $e^X e^Y = e^{X+Y}$ if and only if $XY = YX$.


7.7 Eichler’s proof of Campbell—Baker—Hausdorff 


To facilitate an inductive proof, we let

$$e^A e^B = e^Z, \qquad Z = F_1(A,B) + F_2(A,B) + F_3(A,B) + \cdots, \tag{*}$$

where $F_n(A,B)$ is the sum of all the terms of degree $n$ in $Z$, and hence is a
homogeneous polynomial of degree $n$ in the variables $A$ and $B$. Since the
variables stand for matrices in the Lie algebra $\mathfrak{g}$, they do not generally
commute, but their product is associative. From the calculation in the previous
section we have

$$F_1(A,B) = A + B, \qquad F_2(A,B) = \frac{1}{2}(AB - BA) = \frac{1}{2}[A,B].$$

We will call a polynomial $p(A,B,C,\ldots)$ Lie if it is a linear combination
of $A, B, C, \ldots$ and (possibly nested) Lie bracket terms in $A, B, C, \ldots$. Thus
$F_1(A,B)$ and $F_2(A,B)$ are Lie polynomials, and the theorem we wish to
prove is:


Campbell-Baker-Hausdorff theorem. For each $n \ge 1$, the polynomial
$F_n(A,B)$ in (*) is Lie.


Proof. Since products of $A, B, C, \ldots$ are associative, the same is true of
products of power series in $A, B, C, \ldots$, so for any $A, B, C$ we have

$$(e^A e^B)e^C = e^A(e^B e^C),$$

and therefore, if $e^A e^B e^C = e^W$,

$$W = \sum_{i=1}^{\infty} F_i\Big(\sum_{j=1}^{\infty} F_j(A,B),\; C\Big)
= \sum_{i=1}^{\infty} F_i\Big(A,\; \sum_{j=1}^{\infty} F_j(B,C)\Big). \tag{1}$$

Our induction hypothesis is that $F_m$ is a Lie polynomial for $m < n$, and we
wish to prove that $F_n$ is Lie.


The induction hypothesis implies that all homogeneous terms of degree
less than $n$ in both expressions for $W$ in (1) are Lie, and so too are the
homogeneous terms of degree $n$ resulting from $i > 1$ and $j > 1$. The only
possible exceptions are the polynomials

$F_n(A,B) + F_n(A+B,C)$ on the left (from $i = 1$, $j = n$ and $i = n$, $j = 1$),
$F_n(A,B+C) + F_n(B,C)$ on the right (from $i = n$, $j = 1$ and $i = 1$, $j = n$).

Therefore, equating terms of degree $n$ on both sides of (1), we find that the
difference between the exceptional polynomials is a Lie polynomial. This
property is a congruence relation between polynomials that we write as

$$F_n(A,B) + F_n(A+B,C) \equiv_{\mathrm{Lie}} F_n(A,B+C) + F_n(B,C). \tag{2}$$

Relation (2) yields many consequences, by substituting special values of
the variables $A$, $B$, and $C$, and from it we eventually derive $F_n(A,B) \equiv_{\mathrm{Lie}} 0$,
thus proving the desired result that $F_n$ is Lie.

Before we start substituting, here are three general facts concerning
real multiples of the variables.

1. $F_n(rA,sA) = 0$, because the matrices $rA$ and $sA$ commute and hence
$e^{rA}e^{sA} = e^{rA+sA}$. That is, $Z = F_1(rA,sA)$, so all other $F_n(rA,sA) = 0$.

2. In particular, $r = 1$ and $s = 0$ gives $F_n(A,0) = 0$.

3. $F_n(rA,rB) = r^n F_n(A,B)$ because $F_n$ is homogeneous of degree $n$.

These facts guide the following substitutions in the congruence (2).

First, replace $C$ by $-B$ in (2), obtaining

$$\begin{aligned}
F_n(A,B) + F_n(A+B,-B) &\equiv_{\mathrm{Lie}} F_n(A,0) + F_n(B,-B) \\
&\equiv_{\mathrm{Lie}} 0 && \text{by facts 2 and 1.}
\end{aligned}$$

Therefore

$$F_n(A,B) \equiv_{\mathrm{Lie}} -F_n(A+B,-B). \tag{3}$$

Then replace $A$ by $-B$ in (2), obtaining

$$F_n(-B,B) + F_n(0,C) \equiv_{\mathrm{Lie}} F_n(-B,B+C) + F_n(B,C),$$

which gives, by facts 1 and 2 again,

$$0 \equiv_{\mathrm{Lie}} F_n(-B,B+C) + F_n(B,C).$$

Next, replacing $B$, $C$ by $A$, $B$ respectively gives

$$0 \equiv_{\mathrm{Lie}} F_n(-A,A+B) + F_n(A,B),$$

and hence

$$F_n(A,B) \equiv_{\mathrm{Lie}} -F_n(-A,A+B). \tag{4}$$

Relations (3) and (4) allow us to relate $F_n(A,B)$ to $F_n(B,A)$ as follows:

$$\begin{aligned}
F_n(A,B) &\equiv_{\mathrm{Lie}} -F_n(-A,A+B) && \text{by (4)} \\
&\equiv_{\mathrm{Lie}} -(-F_n(-A+A+B,\,-A-B)) && \text{by (3)} \\
&\equiv_{\mathrm{Lie}} F_n(B,-A-B) \\
&\equiv_{\mathrm{Lie}} -F_n(-B,-A) && \text{by (4)} \\
&\equiv_{\mathrm{Lie}} -(-1)^n F_n(B,A) && \text{by fact 3.}
\end{aligned}$$

Thus the relation between $F_n(A,B)$ and $F_n(B,A)$ is

$$F_n(A,B) \equiv_{\mathrm{Lie}} -(-1)^n F_n(B,A). \tag{5}$$
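Second, we replace $C$ by $-B/2$ in (2), which gives

$$\begin{aligned}
F_n(A,B) + F_n(A+B,-B/2) &\equiv_{\mathrm{Lie}} F_n(A,B/2) + F_n(B,-B/2) \\
&\equiv_{\mathrm{Lie}} F_n(A,B/2) && \text{by fact 1,}
\end{aligned}$$

so

$$F_n(A,B) \equiv_{\mathrm{Lie}} F_n(A,B/2) - F_n(A+B,-B/2). \tag{6}$$

Next, replacing $A$ by $-B/2$ in (2) gives

$$F_n(-B/2,B) + F_n(B/2,C) \equiv_{\mathrm{Lie}} F_n(-B/2,B+C) + F_n(B,C),$$

and therefore, by fact 1,

$$F_n(B/2,C) \equiv_{\mathrm{Lie}} F_n(-B/2,B+C) + F_n(B,C).$$

Then, replacing $B$, $C$ by $A$, $B$ respectively gives

$$F_n(A/2,B) \equiv_{\mathrm{Lie}} F_n(-A/2,A+B) + F_n(A,B),$$

that is,

$$F_n(A,B) \equiv_{\mathrm{Lie}} F_n(A/2,B) - F_n(-A/2,A+B). \tag{7}$$

Relations (6) and (7) allow us to pass from polynomials in $A$, $B$ to
polynomials in $A/2$, $B/2$, paving the way for another application of fact 3
and a new relation, between $F_n(A,B)$ and itself.

Relation (6) allows us to rewrite the two terms on the right side of (7)
as follows:

$$\begin{aligned}
F_n(A/2,B) &\equiv_{\mathrm{Lie}} F_n(A/2,B/2) - F_n(A/2+B,\,-B/2) && \text{by (6)} \\
&\equiv_{\mathrm{Lie}} F_n(A/2,B/2) + F_n(A/2+B/2,\,B/2) && \text{by (3)} \\
&\equiv_{\mathrm{Lie}} 2^{-n}F_n(A,B) + 2^{-n}F_n(A+B,B) && \text{by fact 3,}
\end{aligned}$$

$$\begin{aligned}
F_n(-A/2,A+B) &\equiv_{\mathrm{Lie}} F_n(-A/2,\,A/2+B/2) - F_n(A/2+B,\,-A/2-B/2) && \text{by (6)} \\
&\equiv_{\mathrm{Lie}} -F_n(A/2,B/2) + F_n(B/2,\,A/2+B/2) && \text{by (4) and (3)} \\
&\equiv_{\mathrm{Lie}} -2^{-n}F_n(A,B) + 2^{-n}F_n(B,A+B) && \text{by fact 3.}
\end{aligned}$$

So (7) becomes

$$F_n(A,B) \equiv_{\mathrm{Lie}} 2^{1-n}F_n(A,B) + 2^{-n}F_n(A+B,B) - 2^{-n}F_n(B,A+B),$$

and, with the help of (5), this simplifies to

$$(1 - 2^{1-n})F_n(A,B) \equiv_{\mathrm{Lie}} 2^{-n}(1 + (-1)^n)F_n(A+B,B). \tag{8}$$

If $n$ is odd, (8) already shows that $F_n(A,B) \equiv_{\mathrm{Lie}} 0$.

If $n$ is even, we replace $A$ by $A - B$ in (8), obtaining

$$(1 - 2^{1-n})F_n(A-B,B) \equiv_{\mathrm{Lie}} 2^{1-n}F_n(A,B). \tag{9}$$

The left side of (9) satisfies

$$(1 - 2^{1-n})F_n(A-B,B) \equiv_{\mathrm{Lie}} -(1 - 2^{1-n})F_n(A,-B) \quad \text{by (3),}$$

so, making this replacement, (9) becomes

$$-F_n(A,-B) \equiv_{\mathrm{Lie}} \frac{2^{1-n}}{1 - 2^{1-n}}\,F_n(A,B). \tag{10}$$

Finally, replacing $B$ by $-B$ in (10), we get

$$\begin{aligned}
-F_n(A,B) &\equiv_{\mathrm{Lie}} \frac{2^{1-n}}{1 - 2^{1-n}}\,F_n(A,-B) \\
&\equiv_{\mathrm{Lie}} -\left(\frac{2^{1-n}}{1 - 2^{1-n}}\right)^2 F_n(A,B) && \text{by (10),}
\end{aligned}$$

and this implies $F_n(A,B) \equiv_{\mathrm{Lie}} 0$, as required.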
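
To see how the last steps force the conclusion, it may help to substitute the first odd and even values of $n$ beyond the base cases. The block below is only a worked instance of relations (8) and (10) for $n = 3$ and $n = 4$; it adds nothing to the argument itself.

```latex
% n = 3 (odd): the factor 1 + (-1)^n in (8) vanishes.
\left(1 - 2^{-2}\right) F_3(A,B) \equiv_{\mathrm{Lie}} 2^{-3}\,\bigl(1 + (-1)^3\bigr)\, F_3(A+B,B) = 0
\quad\Longrightarrow\quad \tfrac{3}{4}\, F_3(A,B) \equiv_{\mathrm{Lie}} 0 .

% n = 4 (even): the constant in (10) is 2^{-3} / (1 - 2^{-3}) = 1/7.
-F_4(A,-B) \equiv_{\mathrm{Lie}} \tfrac{1}{7}\, F_4(A,B)
\quad\Longrightarrow\quad
-F_4(A,B) \equiv_{\mathrm{Lie}} \tfrac{1}{7}\, F_4(A,-B) \equiv_{\mathrm{Lie}} -\tfrac{1}{49}\, F_4(A,B)
\quad\Longrightarrow\quad \tfrac{48}{49}\, F_4(A,B) \equiv_{\mathrm{Lie}} 0 .
```

For $n = 2$ the constant in (10) equals 1, so the final step gives no information; that case is covered instead by the explicit formula $F_2(A,B) = \frac{1}{2}[A,B]$.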
Exercises

The congruence relation (5)

$$F_n(A,B) \equiv_{\mathrm{Lie}} -(-1)^n F_n(B,A)$$

discovered in the above proof can be strengthened remarkably to

$$F_n(A,B) = -(-1)^n F_n(B,A).$$

Here is why.


7.7.1 If $Z(A,B)$ denotes the solution $Z$ of the equation $e^A e^B = e^Z$, explain why
$Z(-B,-A) = -Z(A,B)$.


7.7.2 Assuming that one may “equate coefficients” for power series in
noncommuting variables, deduce from Exercise 7.7.1 that

$$F_n(A,B) = -(-1)^n F_n(B,A).$$


7.8 Discussion 


The beautiful self-contained theory of matrix Lie groups seems to have 
been discovered by von Neumann [1929]. In this little-known paper⁵ von
Neumann defines the matrix Lie groups as closed subgroups of GL(n,C), 
and their “tangents” as limits of convergent sequences of matrices. In this 
chapter we have recapitulated some of von Neumann’s results, streamlin- 
ing them slightly by using now-standard techniques of calculus and linear 
algebra. In particular, we have followed von Neumann in using the ma- 
trix exponential and logarithm to move smoothly back and forth between 
a matrix Lie group and its tangent space, without appealing to existence 
theorems for inverse functions and the solution of differential equations. 

The idea of using matrix Lie groups to introduce Lie theory was sug- 
gested by Howe [1983]. The recent texts of Rossmann [2002], Hall [2003], 
and Tapp [2005] take up this suggestion, but they move away from the ideas 
of von Neumann cited by Howe. All put similar theorems on center stage— 
viewing the Lie algebra g of G as both the tangent space and the domain 
of the exponential function—but they rely on analytic existence theorems 
rather than on von Neumann’s rock-bottom approach through convergent 
sequences of matrices. 


⁵ The only book I know that gives due credit to von Neumann’s paper is Godement
[2004], where it is described on p. 69 as “the best possible introduction to Lie groups” and
“the first ‘proper’ exposition of the subject.”

Indeed, von Neumann’s purpose in pursuing elementary constructions
in Lie theory was to explain why continuity apparently implies differentia- 
bility for groups, a question raised by Hilbert in 1900 that became known as 
Hilbert’s fifth problem. It would take us too far afield to explain Hilbert’s 
fifth problem more precisely than we have already done in Section 7.3, 
other than to say that von Neumann showed that the answer is yes for com- 
pact groups, and that Gleason, Montgomery, and Zippin showed in 1952 
that the answer is yes for all groups. 

As mentioned in Section 4.7, Hamilton made the first extension of the
exponential function to a noncommutative domain by defining it for
quaternions in 1843. He observed almost immediately that it maps the pure
imaginary quaternions onto the unit quaternions, and that $e^{q+q'} = e^q e^{q'}$ when
$qq' = q'q$. He took the idea further in his Elements of Quaternions of
1866, realizing that $e^{q+q'}$ is not usually equal to $e^q e^{q'}$, because of the
noncommutative quaternion product. On p. 425 of Volume I he actually finds
the second-order approximation to the Campbell-Baker-Hausdorff series:

$$e^{q+q'} - e^q e^{q'} = \frac{q'q - qq'}{2} + \text{terms of third and higher dimensions}.$$


The early proofs (or attempted proofs) of the general Campbell-Baker-Hausdorff
theorem around 1900 were extremely lengthy, around 20 pages.
The situation did not improve when Bourbaki developed a more conceptual
approach to the theorem in the 1960s. See for example Serre [1965], or
Section 4 of Bourbaki [1972], Chapter II. Bourbaki believes that the proper
setting for the theorem is in the framework of free magmas, free algebras, 
free groups, and free Lie algebras, all of which takes longer to explain 
than the proofs by Campbell, Baker, and Hausdorff. It seems to me that 
these proofs are totally outclassed by the Eichler proof I have used in this 
chapter, which assumes only that the variables A, B, C have an associative 
product, and uses only calculations that a high-school student can follow. 

Martin Eichler (1912-1992) was a German mathematician (later living 
in Switzerland) who worked mainly in number theory and related parts of 
algebra and analysis. A famous saying, attributed to him, is that there are 
five fundamental operations of arithmetic: addition, subtraction, multipli- 
cation, division, and modular forms. Some of his work involves orthogonal 
groups, but nevertheless his 1968 paper on the Campbell—Baker—Hausdorff 
theorem seems to come out of the blue. Perhaps this is a case in which an 
outsider saw the essence of a theorem more clearly than the experts.


Topology


PREVIEW 


One of the essential properties of a Lie group G is that the product and in- 
verse operations on G are continuous functions. Consequently, there comes 
a point in Lie theory where it is necessary to study the theory of continuity, 
that is, topology. Our journey has now reached that point. 

We introduce the concepts of open and closed sets, in the concrete
setting of $k$-dimensional Euclidean space $\mathbb{R}^k$, and use them to explain the
related concepts of continuity, compactness, paths, path-connectedness, and
simple connectedness. The first fruit of this development is a topological
characterization of matrix Lie groups, defined in Section 7.2 through the
limit concept.

All such groups are subgroups of the general linear group GL(n,C) 
of invertible complex matrices, for some n. They are precisely the closed 
subgroups of GL(n,C). 

The concepts of compactness and path-connectedness serve to refine
this description. For example, O(n) and SO(n) are compact but GL(n,C)
is not; SO(n) is path-connected but O(n) is not.

Finally, we introduce the concept of deformation of paths, which al- 
lows us to define simple connectivity. A simply connected space is one in 
which any two paths between two points are deformable into each other. 
This refines the qualitative description of Lie groups further—for exam- 
ple, SU(2) is simply connected but SO(2) is not—but simply connected 
groups have a deeper importance that will emerge when we reconnect with 
Lie algebras in the next chapter. 


160 J. Stillwell, Naive Lie Theory, DOI: 10.1007/978-0-387-78214-0_8,


8.1 Open and closed sets in Euclidean space


The geometric setting used throughout this book is the Euclidean space

$$\mathbb{R}^k = \{(x_1, x_2, \ldots, x_k) : x_1, x_2, \ldots, x_k \in \mathbb{R}\},$$

with distance $d(X,Y)$ between points

$$X = (x_1, x_2, \ldots, x_k) \quad\text{and}\quad Y = (y_1, y_2, \ldots, y_k)$$

defined by

$$d(X,Y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_k - y_k)^2}.$$

This is the distance on $\mathbb{R}^k$ that is invariant under the transformations in
the group O(k) and its subgroup SO(k). Also, when we interpret $\mathbb{C}^n$ as
$\mathbb{R}^{2n}$ by letting the point $(x_1 + ix_1', x_2 + ix_2', \ldots, x_n + ix_n') \in \mathbb{C}^n$ correspond
to the point $(x_1, x_1', x_2, x_2', \ldots, x_n, x_n') \in \mathbb{R}^{2n}$ then the distance defined by the
Hermitian inner product on $\mathbb{C}^n$ is the same as the Euclidean distance on
$\mathbb{R}^{2n}$, as we saw in Section 3.3. Likewise, the distance on $\mathbb{H}^n$ defined by its
Hermitian inner product is the same as the Euclidean distance on $\mathbb{R}^{4n}$.

As in Section 4.5 we view an $n \times n$ real matrix $A$ with $(i,j)$-entry $a_{ij}$
as the point $(a_{11}, a_{12}, \ldots, a_{1n}, a_{21}, \ldots, a_{nn}) \in \mathbb{R}^{n^2}$, and define the absolute
value $|A|$ of $A$ as the Euclidean distance $\sqrt{\sum_{i,j} a_{ij}^2}$ of this point from 0 in
$\mathbb{R}^{n^2}$. We similarly define the absolute value of $n \times n$ complex and
quaternion matrices by interpreting them as points of $\mathbb{R}^{2n^2}$ and $\mathbb{R}^{4n^2}$, respectively.
Then if we take the distance between matrices $A$ and $B$ of the same size
and type to be $|A - B|$, we can speak of a convergent sequence of matrices
$A_1, A_2, A_3, \ldots$ with limit $A$, or of a continuous matrix-valued function $A(t)$
by using the usual definitions in terms of distance $\varepsilon$ from the limit.
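
As a small illustration of these definitions, the sketch below computes the absolute value of a matrix as the Euclidean length of its list of entries and follows a convergent sequence of matrices. The function names and the matrices `A` and `M` are hypothetical choices made for this example; for complex entries the squared modulus of each entry is used, which matches the interpretation of complex matrices as points of a real Euclidean space.

```python
import math

def absolute_value(A):
    """|A|: Euclidean length of the entries of A (real or complex entries)."""
    return math.sqrt(sum(abs(a) ** 2 for row in A for a in row))

def distance(A, B):
    """Distance |A - B| between two matrices of the same size."""
    return absolute_value([[a - b for a, b in zip(row_a, row_b)]
                           for row_a, row_b in zip(A, B)])

A = [[3.0, 0.0], [0.0, 4.0]]
M = [[1.0, -2.0], [0.5, 1.0]]   # hypothetical perturbation with |M| = 2.5

def A_k(k):
    """The matrix A + M/k, the k-th term of a sequence converging to A."""
    return [[a + m / k for a, m in zip(row_a, row_m)]
            for row_a, row_m in zip(A, M)]

print(absolute_value(A))            # length of the point (3, 0, 0, 4)
for k in (1, 10, 100, 1000):
    print(k, distance(A_k(k), A))   # equals |M|/k up to rounding
```

Here the distance from `A_k(k)` to `A` is $|M|/k$, so for any $\varepsilon > 0$ all terms with $k > 2.5/\varepsilon$ lie within distance $\varepsilon$ of the limit.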

Topology gives a general language for the discussion of limits and con- 
tinuity by expressing them in terms of open sets. 


Open and closed sets 


To be able to express the idea of a “neighborhood” concisely we introduce
the notation $N_\varepsilon(P)$ for the open $\varepsilon$-ball with center $P$, that is,

$$N_\varepsilon(P) = \{Q \in \mathbb{R}^k : |P - Q| < \varepsilon\}.$$

The set $N_\varepsilon(P)$ is also called the $\varepsilon$-neighborhood of $P$.

A set $\mathcal{O} \subseteq \mathbb{R}^k$ is called open if, along with any point $P \in \mathcal{O}$, there is an
$\varepsilon$-neighborhood $N_\varepsilon(P) \subseteq \mathcal{O}$ for some $\varepsilon > 0$. Three properties of open sets
follow almost immediately from this definition.⁶


1. Both $\mathbb{R}^k$ and the empty set {} are open.

2. Any union of open sets is open.

3. The intersection of two (and hence any finite number of) open sets is
open.


The third property holds because if $P \in \mathcal{O}_1$ and $P \in \mathcal{O}_2$ then we have

$$P \in N_{\varepsilon_1}(P) \subseteq \mathcal{O}_1 \quad\text{and}\quad P \in N_{\varepsilon_2}(P) \subseteq \mathcal{O}_2,$$

so $P \in N_\varepsilon(P) \subseteq \mathcal{O}_1 \cap \mathcal{O}_2$, where $\varepsilon$ is the minimum of $\varepsilon_1$ and $\varepsilon_2$.
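
The argument for the third property can be tried out on two overlapping open discs in the plane. The sketch below is a hedged illustration: the discs in `balls`, the point `P`, and all function names are hypothetical choices, and random sampling only spot-checks the inclusion rather than proving it. For an open ball of radius $r$ and center $C$, the triangle inequality shows that $\varepsilon_i = r - |P - C|$ works as a radius around $P$, and the minimum of the two such radii serves for the intersection.

```python
import math
import random

def dist(P, Q):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(P, Q)))

def in_ball(Q, center, radius):
    """True if Q lies in the open ball N_radius(center)."""
    return dist(Q, center) < radius

def room(P, center, radius):
    """An e with N_e(P) inside N_radius(center), from the triangle inequality."""
    return radius - dist(P, center)

def common_epsilon(P, balls):
    """Minimum of the individual radii: a radius that works for every ball at once."""
    return min(room(P, c, r) for c, r in balls)

def sample_near(P, eps, rng):
    """A random point Q of the plane with |P - Q| < eps."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    radius = 0.999 * eps * rng.random()
    return (P[0] + radius * math.cos(angle), P[1] + radius * math.sin(angle))

balls = [((0.0, 0.0), 1.0), ((1.0, 0.0), 0.8)]   # two overlapping open discs
P = (0.5, 0.2)                                   # a point in their intersection
eps = common_epsilon(P, balls)
rng = random.Random(0)
samples = [sample_near(P, eps, rng) for _ in range(1000)]
all_inside = all(in_ball(Q, c, r) for Q in samples for c, r in balls)
print(eps, all_inside)
```

The same minimum works for any finite number of open sets, but not for infinitely many, since an infinite set of positive radii need not have a positive minimum.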

Open sets are the fundamental concept of topology, and all other
topological concepts can be defined in terms of them. For example, a closed
set⁷ $\mathcal{F}$ is one whose complement $\mathbb{R}^k - \mathcal{F}$ is open. It follows from
properties 1, 2, 3 of open sets that we have the following properties of closed
sets:


1. Both $\mathbb{R}^k$ and the empty set {} are closed.

2. Any intersection of closed sets is closed.

3. The union of two (and hence any finite number of) closed sets is
closed.


The reason for calling such sets “closed” is that they are closed under the
operation of adding limit points. A limit point of a set $\mathcal{S}$ is a point $P$
such that every $\varepsilon$-neighborhood of $P$ includes points of $\mathcal{S}$. A closed set $\mathcal{F}$
includes all its limit points $P$. This is so because if $P$ is a point not in $\mathcal{F}$
then $P$ is in the open complement $\mathbb{R}^k - \mathcal{F}$ and hence $P$ has a neighborhood
$N_\varepsilon(P) \subseteq \mathbb{R}^k - \mathcal{F}$. But then $N_\varepsilon(P)$ does not include any points of $\mathcal{F}$, so $P$
is not a limit point of $\mathcal{F}$.


⁶In general topology, where $\mathbb{R}^k$ is replaced by an arbitrary set $\mathcal{S}$, these three properties
define what is called a collection of open sets. In general topology there need be no underlying
concept of “distance,” hence open sets cannot always be defined in terms of $\varepsilon$-balls.
We will make use of the concept of distance where it is convenient, but it will be noticed
that the general topological properties of open sets frequently give a natural proof.

⁷It is traditional to denote closed sets by the initial letter of “fermé,” the French word
for “closed.”


The relative topology


Many spaces $\mathcal{S}$ other than $\mathbb{R}^k$ have a notion of distance, so the definition
of open and closed sets in terms of $\varepsilon$-balls may be carried over directly. In
particular, if $\mathcal{S}$ is a subset of some $\mathbb{R}^k$ we have:


• The $\varepsilon$-balls of $\mathcal{S}$, $N_\varepsilon(P) = \{Q \in \mathcal{S} : |P - Q| < \varepsilon\}$, are the intersections
of $\mathcal{S}$ with $\varepsilon$-balls of $\mathbb{R}^k$.

• So the open subsets of $\mathcal{S}$ are the intersections of $\mathcal{S}$ with the open
subsets of $\mathbb{R}^k$.

• So the closed subsets of $\mathcal{S}$ are the intersections of $\mathcal{S}$ with the closed
subsets of $\mathbb{R}^k$.


The topology resulting from this definition of open set is called the relative
topology on $\mathcal{S}$. It is important at a few places in this chapter, notably for
the definition of a matrix Lie group in the next section.

Notice that $\mathcal{S}$ is automatically a closed set in the relative topology,
since it is the intersection of $\mathcal{S}$ with a closed subset of $\mathbb{R}^k$, namely $\mathbb{R}^k$
itself. This does not imply that $\mathcal{S}$ contains all its limit points; indeed, this
happens only if $\mathcal{S}$ is a closed subset of $\mathbb{R}^k$.


Exercises 


Open sets and closed sets are common in mathematics. For example, an open
interval $(a,b) = \{x \in \mathbb{R} : a < x < b\}$ is an open subset of $\mathbb{R}$ and a closed interval
$[a,b] = \{x \in \mathbb{R} : a \le x \le b\}$ is closed.


8.1.1 Show that a half-open interval $[a,b) = \{x : a \le x < b\}$ is neither open nor
closed.


8.1.2 With the help of Exercise 8.1.1, or otherwise, give an example of an infinite 
union of closed sets that is not closed. 


8.1.3 Give an example of an infinite intersection of open sets that is not open. 


Since a random subset $\mathcal{T}$ of a space $\mathcal{S}$ may not be closed we sometimes
find it convenient to introduce a closure operation that takes the intersection of all
closed sets $\mathcal{F} \supseteq \mathcal{T}$:

$$\mathrm{closure}(\mathcal{T}) = \bigcap \{\mathcal{F} \subseteq \mathcal{S} : \mathcal{F} \text{ is closed and } \mathcal{F} \supseteq \mathcal{T}\}.$$

8.1.4 Explain why $\mathrm{closure}(\mathcal{T})$ is a closed set containing $\mathcal{T}$.


8.1.5 Explain why it is reasonable to call $\mathrm{closure}(\mathcal{T})$ the “smallest” closed set
containing $\mathcal{T}$.


8.1.6 Show that $\mathrm{closure}(\mathcal{T}) = \mathcal{T} \cup \{\text{limit points of } \mathcal{T}\}$ when $\mathcal{T} \subseteq \mathbb{R}^k$.


8.2 Closed matrix groups


In Lie theory, closed sets are important from the beginning, because all 
matrix Lie groups are closed sets in the appropriate topology. This has to do 
with the continuity of matrix multiplication and the determinant function, 
which we assume for now. In the next section we will discuss continuity 
and its relationship with open and closed sets more thoroughly. 


Example 1. The circle group $S^1 = \mathrm{SO}(2)$.


Viewed as a set of points in $\mathbb{C}$ or $\mathbb{R}^2$, the unit circle is a closed set
because its complement (the set of points not on the circle) is clearly open.
Figure 8.1 shows a typical point $P$ not on the circle and an $\varepsilon$-neighborhood
of $P$ that lies in the complement of the circle. The open neighborhood of $P$
is colored gray and its perimeter is drawn dotted to indicate that boundary
points are not included.


Figure 8.1: Why the complement of the circle is open. 


Example 2. The groups O(n) and SO(n).


We view O(n) as a subset of the space $\mathbb{R}^{n^2}$ of $n \times n$ real matrices, which
we also call $M_n(\mathbb{R})$. The complement of O(n) is

$$M_n(\mathbb{R}) - \mathrm{O}(n) = \{A \in M_n(\mathbb{R}) : AA^{\mathsf{T}} \ne \mathbf{1}\}.$$

This set is open because if $A$ is a matrix in $M_n(\mathbb{R})$ with $AA^{\mathsf{T}} \ne \mathbf{1}$ then
some entries of $AA^{\mathsf{T}}$ are unequal to the corresponding entries (1 or 0) in $\mathbf{1}$.
It follows, since matrix multiplication and transpose are continuous, that
$BB^{\mathsf{T}}$ also has entries unequal to the corresponding entries of $\mathbf{1}$ for any $B$
sufficiently close to $A$. Thus some $\varepsilon$-neighborhood of $A$ is contained in
$M_n(\mathbb{R}) - \mathrm{O}(n)$, so O(n) is closed.

Matrices $A$ in SO(n) satisfy the additional condition $\det(A) = 1$. The
matrices $A$ not satisfying this condition form an open set because det is
a continuous function. Namely, if $\det(A) \ne 1$, then $\det(B) \ne 1$ for any $B$
sufficiently close to $A$; hence any $A$ in the set where $\det \ne 1$ has a whole
$\varepsilon$-neighborhood in this set. Thus the matrices $A$ for which $\det(A) = 1$ form
a closed set. The group SO(n) is the intersection of this closed set with the
closed set O(n), hence SO(n) is itself closed.


Example 3. The group Aff(1). 


We view Aff(1) as in Section 4.6, namely, as the group of real matrices
of the form $A = \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$, where $a, b \in \mathbb{R}$ and $a > 0$. It is now easy to see that
the group is not closed, because it contains the sequence

$$A_n = \begin{pmatrix} 1/n & 0 \\ 0 & 1 \end{pmatrix},$$

whose limit $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is not in Aff(1). However, Aff(1) is closed in the “relative”
sense: as a subset of the largest $2 \times 2$ matrix group that contains it.
This is because Aff(1) is the intersection of a closed set (the set of matrices
$\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $a \ge 0$) with the set of all invertible $2 \times 2$ matrices. This
brings us to our next example.


Example 4. The general linear group GL(n,C). 


The group GL(n,C) is the set of all invertible $n \times n$ complex matrices.
This set is a group because it is closed under products (since $A^{-1}B^{-1} =
(BA)^{-1}$) and under inverses (obviously). It follows that every group of real
or complex matrices is a subgroup of some GL(n,C),⁸ which is why we
bring it up now. We are about to define what a “matrix Lie group” is, and
we wish to say that it is some kind of subgroup of GL(n,C).

But first notice that GL(n,C) is not a closed subset of the space $M_n(\mathbb{C})$
of $n \times n$ complex matrices. Indeed, if $\mathbf{1}$ is the $n \times n$ identity matrix, then the
matrices $\mathbf{1}/2, \mathbf{1}/3, \mathbf{1}/4, \ldots$ all belong to GL(n,C) but their limit $\mathbf{0}$ does not.
We can say only that GL(n,C) is a closed subset of itself, and the definition
of matrix Lie group turns upon this appeal to the relative topology.
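
The failure of closedness can be checked by direct computation. The following Python sketch is only an illustration under our own assumptions: it takes n = 2, and the helper names `scaled_identity` and `det2` are ours. It computes the determinants of the matrices 1/2, 1/3, 1/4, ... exactly. Each determinant is nonzero, so each matrix is invertible, but the determinant of the limit matrix is 0.

```python
from fractions import Fraction

def scaled_identity(m):
    """Return the 2 x 2 matrix (1/m) * identity, with exact rational entries."""
    c = Fraction(1, m)
    return [[c, Fraction(0)], [Fraction(0), c]]

def det2(a):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Every matrix in the sequence is invertible: its determinant 1/m^2 is nonzero.
dets = [det2(scaled_identity(m)) for m in range(2, 7)]

# The entrywise limit is the zero matrix, whose determinant is 0,
# so the limit lies outside GL(2, C).
zero = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
limit_det = det2(zero)
```

The sequence stays inside the group while its limit does not, which is exactly what it means for the group to fail to be closed in the space of all matrices.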


⁸GL(n, C) was called “Her All-embracing Majesty” by Hermann Weyl in his book The
Classical Groups. Notice that quaternion groups may also be viewed as subgroups of
GL(n,C), thanks to the identification of quaternions with certain $2 \times 2$ complex matrices
in Section 1.3.


Matrix Lie groups

With the understanding that the topology of all matrix groups should be 
considered relative to GL(n,C), we make the following definition: 
Definition. A matrix Lie group is a closed subgroup of GL(n,C). 


This definition is beautifully simple, but still surprising. Lie groups are
supposed to be “smooth,” yet closed sets are not usually smooth (think of a
square or a triangle, say). Apparently the group operation has a “smoothing”
effect. And again, there are some closed subgroups of GL(n,C) that
do not even look smooth, for example the group $\{\mathbf{1}\}$ consisting of a single
point! The worry about $\{\mathbf{1}\}$ disappears when one takes a sufficiently general
definition of “smoothness,” as explained in Section 5.8. The real secret
of smoothness is the matrix exponential function, as we saw in Section 7.3.


Exercises 


8.2.1 Prove that U(n), SU(n), and Sp(n) are closed subsets of the appropriate 
matrix spaces. 


The general linear group GL(n, C) is usually introduced alongside the special
linear group. Both are subsets of the space $M_n(\mathbb{C})$ of complex $n \times n$ matrices:

$$\mathrm{GL}(n,\mathbb{C}) = \{A : \det(A) \ne 0\} \quad\text{and}\quad \mathrm{SL}(n,\mathbb{C}) = \{A : \det(A) = 1\}.$$

8.2.2 Show that GL(n, C) is an open subset of $M_n(\mathbb{C})$.


8.2.3 Show that SL(n,C) is a closed subset of $M_n(\mathbb{C})$.

8.2.4 If $H$ is an arbitrary subgroup of a matrix Lie group $G$, show that

$$\{\text{sequential tangents of } H\} = T_{\mathbf{1}}(\mathrm{closure}(H)).$$


8.3 Continuous functions 


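As in elementary analysis, we define a function $f$ to be continuous at a
point $A$ if, for each $\varepsilon > 0$, there is a $\delta > 0$ such that

$$|B - A| < \delta \;\Rightarrow\; |f(B) - f(A)| < \varepsilon.$$

If the points $A$ and $B$ belong to $\mathbb{R}^k$ and the values $f(A)$ and $f(B)$ belong
to $\mathbb{R}^l$ then the $\varepsilon$-$\delta$ condition can be restated as follows: for each $\varepsilon$-ball
$N_\varepsilon(f(A))$ there is a $\delta$-ball $N_\delta(A)$ such that

$$\{f(B) : B \in N_\delta(A)\} \subseteq N_\varepsilon(f(A)). \qquad (*)$$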




It is convenient and natural to introduce the abbreviations

$$f(\mathcal{S}) \text{ for } \{f(B) : B \in \mathcal{S}\}, \qquad f^{-1}(\mathcal{S}) \text{ for } \{B : f(B) \in \mathcal{S}\}.$$

Then the condition (*) can be restated: $f$ is continuous at $A$ if, for each
$\varepsilon > 0$, there is a $\delta > 0$ such that

$$f(N_\delta(A)) \subseteq N_\varepsilon(f(A)).$$

Finally, if $f$ is continuous for some domain of argument values $A$ and some
range of function values $f(A)$ then, for each open subset $\mathcal{O}$ of the range of
$f$, we have

$$f^{-1}(\mathcal{O}) \text{ is open.} \qquad (**)$$

This is because $f^{-1}(\mathcal{O})$ contains, along with each point $A$, a neighborhood
$N_\delta(A)$ of $A$, mapped by $f$ into a neighborhood $N_\varepsilon(f(A))$ of $f(A)$, contained
in the open set $\mathcal{O}$ along with $f(A)$.

Condition (**) is equivalent to condition (*) in spaces such as $\mathbb{R}^k$, and it
serves as the definition of a continuous function in general topology, since
it is phrased in terms of open sets alone.


Basic continuous functions 


As one learns in elementary analysis, the basic functions of arithmetic are 
continuous at all points at which they are defined. Also, composites of con- 
tinuous functions are continuous. For example, the composite of addition, 
subtraction, and division given by 


$$f(a,b) = \frac{a+b}{a-b}$$

is continuous for all pairs $(a,b)$ at which it is defined, that is, for all pairs
such that $a \ne b$.

A matrix function $f$ is called continuous at $A$ if it satisfies the $\varepsilon$-$\delta$
definition for absolute value of matrices. That is, for all $\varepsilon > 0$ there is a $\delta > 0$ such
that

$$|B - A| < \delta \;\Rightarrow\; |f(B) - f(A)| < \varepsilon.$$

This is equivalent to being a continuous numerical function of the matrix
entries. Important examples for Lie theory are the matrix product and the
determinant, both of which are continuous because they are built from addition
and multiplication of numbers. The matrix inverse $f(A) = A^{-1}$ is
also a continuous function of $A$, built from addition, multiplication, and division
(by $\det(A)$). It is defined for all $A$ with $\det(A) \ne 0$, which of course
are also the $A$ for which $A^{-1}$ exists.
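
To see concretely how the inverse is built from arithmetic operations, here is a minimal Python sketch for the $2 \times 2$ case, using the adjugate formula. The function name `inverse2` is ours, and the restriction to $2 \times 2$ matrices is an assumption made to keep the example short; the same idea applies in any dimension via cofactors.

```python
from fractions import Fraction

def inverse2(a):
    """Inverse of a 2 x 2 matrix via the adjugate formula.

    Each entry of the result is built from the entries of `a` by
    subtraction, multiplication, and one division by det(a).
    """
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ZeroDivisionError("matrix is not invertible: det = 0")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
A_inv = inverse2(A)   # the matrix with rows (1, -1) and (-1, 2)
```

The only step that can fail is the division, and it fails exactly when the determinant is zero.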


Homeomorphisms 


Continuous functions might be considered the “homomorphisms” of topology,
but if so an “isomorphism” is not simply a 1-to-1 homomorphism. A
topological isomorphism should also have a continuous inverse. A continuous
function $f$ such that $f^{-1}$ exists and is continuous is called a homeomorphism.
We will also call such a 1-to-1 correspondence, continuous in
both directions, a continuous bijection.

We must specifically demand a continuous inverse because the inverse
of a continuous 1-to-1 function is not necessarily continuous. The simplest
example is the map from the half-open interval $[0, 2\pi)$ to the circle defined
by $f(\theta) = \cos\theta + i\sin\theta$ (Figure 8.2).




Figure 8.2: The interval and the circle. 


This map $f$ is clearly continuous and 1-to-1, but $f^{-1}$ is not continuous.
For example, $f^{-1}(\mathcal{O})$, where $\mathcal{O}$ is a small open arc of the circle between
angle $-\alpha$ and $\alpha$, is $(2\pi - \alpha, 2\pi) \cup [0, \alpha)$, which is not an open set. (More
informally, $f^{-1}$ sends points that are near each other on the circle to points
that are far apart on the interval.)
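
The informal description can be tested numerically. In the Python sketch below, the names `f`, `f_inverse`, and the sample angle `alpha = 0.01` are our own choices for illustration. Two points on the circle just above and just below the point 1 are very close together, but their preimages lie at opposite ends of the interval.

```python
import cmath
import math

def f(theta):
    """The map from [0, 2*pi) to the unit circle: cos(theta) + i sin(theta)."""
    return cmath.exp(1j * theta)

def f_inverse(z):
    """Angle in [0, 2*pi) of a point z on the unit circle."""
    return cmath.phase(z) % (2 * math.pi)

alpha = 0.01
z_above = f(alpha)                  # just above the point 1 on the circle
z_below = f(2 * math.pi - alpha)    # just below the point 1 on the circle

# Distance between the two points on the circle: about 2 * alpha.
gap_on_circle = abs(z_above - z_below)

# Distance between their preimages in [0, 2*pi): about 2*pi - 2*alpha.
gap_on_interval = abs(f_inverse(z_above) - f_inverse(z_below))
```

Shrinking `alpha` makes the gap on the circle as small as we please while the gap on the interval approaches $2\pi$.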

It is clear that $f$ is not an “isomorphism” between $[0, 2\pi)$ and the circle,
because the two spaces have different topological properties. For example,
the circle is compact but $[0, 2\pi)$ is not. (For the definition of compactness,
see the next section.)


Exercises 


If homeomorphisms are the “isomorphisms” of topological spaces, what operation 
do they preserve? The answer is that homeomorphisms are the 1-to-1 functions f 
that preserve closures, where “closure” is defined in the exercises to Section 8.1: 


$$f(\mathrm{closure}(\mathcal{S})) = \mathrm{closure}(f(\mathcal{S})).$$

8.3.1 Show that if $P$ is a limit point of $\mathcal{S}$ and $f$ is a continuous function defined
on $\mathcal{S}$ and $P$, then $f(P)$ is a limit point of $f(\mathcal{S})$.


8.3.2 If $f$ is a continuous bijection, deduce from Exercise 8.3.1 that

$$f(\mathrm{closure}(\mathcal{S})) = \mathrm{closure}(f(\mathcal{S})).$$


8.3.3 Give examples of continuous functions $f$ on subsets of $\mathbb{R}$ such that $f$(open)
is not open and $f$(closed) is not closed.


8.3.4 Also, give an example of a continuous function $f$ on $\mathbb{R}$ and a set $\mathcal{S}$ such
that

$$f(\mathrm{closure}(\mathcal{S})) \ne \mathrm{closure}(f(\mathcal{S})).$$


8.4 Compact sets 


A compact set in $\mathbb{R}^k$ is one that is closed and bounded. Compact sets
are somewhat better behaved than unbounded closed sets; for example, on 
a compact set a continuous function is uniformly continuous, and a real- 
valued continuous function attains a maximum and a minimum value. One 
learns these results in an introductory real analysis course, but we will 
prove one version of uniform continuity below. In Lie theory, compact 
groups are better behaved than noncompact ones, and fortunately most of 
the classical groups are compact. 

We already know from Section 8.2 that O(n) and SO(n) are closed. To
see why they are compact, recall from Section 3.1 that the columns of any
$A \in \mathrm{O}(n)$ form an orthonormal basis of $\mathbb{R}^n$. This implies that the sum of
the squares of the entries in any column is 1, hence the sum of the squares
of all entries is $n$. In other words, $|A| = \sqrt{n}$, so O(n) is a closed subset of
$\mathbb{R}^{n^2}$ bounded by radius $\sqrt{n}$.
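
As a quick numerical check of the bound, the following Python sketch computes the absolute value of a few rotation matrices in SO(3). The helper names `rotation_z` and `matrix_norm`, and the choice of rotations about the z-axis, are our assumptions for illustration; any orthogonal matrix would serve.

```python
import math

def rotation_z(theta):
    """Rotation of R^3 about the z-axis through angle theta, a member of SO(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def matrix_norm(a):
    """Absolute value |A|: the square root of the sum of squares of all entries."""
    return math.sqrt(sum(x * x for row in a for x in row))

# Each value agrees with sqrt(3) up to rounding error, whatever the angle.
norms = [matrix_norm(rotation_z(theta)) for theta in (0.0, 0.7, 2.0, 5.5)]
```

Each column contributes 1 to the sum of squares, so the total is 3 regardless of the angle.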

There are similar proofs that U(n), SU(n), and Sp(n) are compact.

Compactness may also be defined in terms of open sets, and hence it is 
meaningful in spaces without a concept of distance. The definition is moti- 
vated by the following classical theorem, which expresses the compactness 
of the unit interval [0, 1] in terms of open sets. 


Heine–Borel theorem. If $[0,1]$ is contained in a union of open intervals
$\mathcal{U}_i$, then the union of finitely many $\mathcal{U}_i$ also contains $[0,1]$.


Proof. Suppose, on the contrary, that no finite union of the $\mathcal{U}_i$ contains
$[0,1]$. Then at least one of the subintervals $[0,1/2]$ or $[1/2,1]$ is not contained
in a finite union of $\mathcal{U}_i$ (because if both halves are contained in the
union of finitely many $\mathcal{U}_i$, so is the whole).

Pick, say, the leftmost of the two intervals $[0,1/2]$ and $[1/2,1]$ not contained
in a finite union of $\mathcal{U}_i$ and divide it into halves similarly. By the same
argument, one of the new subintervals is not contained in a finite union of
the $\mathcal{U}_i$, and so on.

By repeating this argument indefinitely, we get an infinite sequence of
intervals $[0,1] = \mathcal{I}_0 \supset \mathcal{I}_1 \supset \mathcal{I}_2 \supset \cdots$. Each $\mathcal{I}_{n+1}$ is half the length of $\mathcal{I}_n$,
and none of them is contained in the union of finitely many $\mathcal{U}_i$. But there
is a single point $P$ in all the $\mathcal{I}_n$ (namely the common limit of their left and
right endpoints), and $P \in [0,1]$ so $P$ is in some $\mathcal{U}_j$.

This is a contradiction, because a sufficiently small $\mathcal{I}_n$ containing $P$ is
contained in $\mathcal{U}_j$, since $\mathcal{U}_j$ is open. So in fact $[0,1]$ is contained in the union
of finitely many $\mathcal{U}_i$.
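
The proof above is by contradiction, but the bisection idea can also be run forward as a procedure. The Python sketch below is our own illustration, not part of the proof: the function name `finite_subcover`, the depth limit `max_depth`, and the sample cover are all assumptions. It halves $[0,1]$ repeatedly until each piece lies inside a single interval of the cover, and it collects the intervals used. For a genuine cover, the nested-interval argument of the proof shows that the halving cannot go on forever.

```python
from fractions import Fraction

def finite_subcover(cover, lo=Fraction(0), hi=Fraction(1), depth=0, max_depth=40):
    """Bisect [lo, hi] until each piece lies inside a single open interval.

    `cover` is a list of open intervals (a, b). Returns the list of the
    intervals actually used, without duplicates. Raises ValueError if the
    bisection goes deeper than `max_depth`, which happens when `cover`
    does not really cover [lo, hi].
    """
    for a, b in cover:
        if a < lo and hi < b:          # closed piece [lo, hi] inside open (a, b)
            return [(a, b)]
    if depth >= max_depth:
        raise ValueError("no covering interval found; is this a cover?")
    mid = (lo + hi) / 2
    used = finite_subcover(cover, lo, mid, depth + 1, max_depth)
    for interval in finite_subcover(cover, mid, hi, depth + 1, max_depth):
        if interval not in used:
            used.append(interval)
    return used

F = Fraction
# A sample cover of [0, 1] by four open intervals, one of them redundant.
cover = [(F(-1, 10), F(3, 10)), (F(1, 5), F(3, 5)),
         (F(1, 2), F(11, 10)), (F(9, 10), F(6, 5))]
subcover = finite_subcover(cover)
```

Exact rational arithmetic is used so that the endpoint comparisons are not affected by rounding.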


The general definition of compactness motivated by this theorem is the
following. A set $\mathcal{K}$ is called compact if, for any collection of open sets
$\mathcal{O}_i$ whose union contains $\mathcal{K}$, there is a finite subcollection $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$
whose union contains $\mathcal{K}$. The collection of sets $\mathcal{O}_i$ is said to be an “open
cover” of $\mathcal{K}$, and the subcollection $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$ is said to be a “finite
subcover,” so the defining property of compactness is often expressed as
“any open cover contains a finite subcover.”

The argument used to prove the Heine–Borel theorem is known as the
“bisection argument,” and it easily generalizes to a “$2^k$-section argument”
in $\mathbb{R}^k$, proving that any closed bounded set has the finite subcover property.

For example, given a closed, bounded set $\mathcal{K}$ in $\mathbb{R}^2$, we take a square
that contains $\mathcal{K}$ and consider the subsets of $\mathcal{K}$ obtained by dividing the
square into four equal subsquares, then dividing the subsquares, and so on.
If $\mathcal{K}$ has no finite subcover, then the same is true of a nested sequence of
subsets with a single common point $P$, which leads to a contradiction as in
the proof for $[0,1]$.


Exercises 


The bisection argument is also effective in another classical theorem about the
unit interval: the Bolzano–Weierstrass theorem, which states that any infinite set
of points $\{P_1, P_2, P_3, \ldots\}$ in $[0,1]$ has a limit point.


8.4.1 Given an infinite set of points $\{P_1, P_2, P_3, \ldots\}$ in $[0,1]$, conclude that at least
one of the subintervals $[0,1/2]$, $[1/2,1]$ contains infinitely many of the $P_i$.


8.4.2 (Bolzano–Weierstrass). By repeated bisection, show that there is a point $P$
in $[0,1]$, every neighborhood of which contains some of the points $P_i$.


8.4.3 Generalize the argument of Exercise 8.4.2 to show that if $\mathcal{K}$ is a closed
bounded set in $\mathbb{R}^k$ containing an infinite set of points $\{P_1, P_2, P_3, \ldots\}$ then
$\mathcal{K}$ includes a limit point of $\{P_1, P_2, P_3, \ldots\}$.


(We used a special case of this theorem in Section 7.4 in claiming that an infi- 
nite sequence of points on the unit sphere has a limit point, and hence a convergent 
subsequence.) 

The generalized Bolzano–Weierstrass theorem of Exercise 8.4.3 may also be
proved very naturally using the finite subcover property of compactness. Suppose,
for the sake of contradiction, that $\{P_1, P_2, P_3, \ldots\}$ is an infinite set of points in a
compact set $\mathcal{K}$, with no limit point in $\mathcal{K}$. It follows that each point $Q \in \mathcal{K}$ has an
open neighborhood $\mathcal{N}(Q)$ in $\mathcal{K}$ (the intersection of an open set with $\mathcal{K}$) free of
points $P_i \ne Q$.


8.4.4 By taking a finite subcover of the cover of $\mathcal{K}$ by the sets $\mathcal{N}(Q)$, show that
the assumption leads to a contradiction.


Not all matrix Lie groups are compact. 


8.4.5 Show that GL(n,C) and SL(n,C) are not compact. 


8.5 Continuous functions and compactness 


We saw in Section 8.3 and its exercises that continuous functions do not 
necessarily preserve open sets or closed sets. However, they do preserve 
compact sets, so this is another example of “better behavior” of compact 
sets. The proof also shows the efficiency of the finite subcover property of 
compactness. 


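Continuous image of a compact set. If $\mathcal{K}$ is compact and $f$ is a continuous
function defined on $\mathcal{K}$ then $f(\mathcal{K})$ is compact.


Proof. Given a collection of open sets $\mathcal{O}_i$ that covers $f(\mathcal{K})$, we have to
show that some finite subcollection $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$ also covers $f(\mathcal{K})$.
Well, since $f$ is continuous and $\mathcal{O}_i$ is open, we know that $f^{-1}(\mathcal{O}_i)$ is
open by Property (**) in Section 8.3. Also, the open sets $f^{-1}(\mathcal{O}_i)$ cover
$\mathcal{K}$ because the $\mathcal{O}_i$ cover $f(\mathcal{K})$. Therefore, by compactness of $\mathcal{K}$, there
is a finite subcollection $f^{-1}(\mathcal{O}_1), f^{-1}(\mathcal{O}_2), \ldots, f^{-1}(\mathcal{O}_m)$ that covers $\mathcal{K}$.
But then $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_m$ covers $f(\mathcal{K})$, as required.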


It may be thought that a problem arises when the open sets $\mathcal{O}_i$ extend
outside $f(\mathcal{K})$, possibly outside the range of the function $f$. We avoid this
problem by considering only open subsets relative to $\mathcal{K}$ and $f(\mathcal{K})$, that
is, the intersections of open sets with $\mathcal{K}$ and $f(\mathcal{K})$. For such sets it is still
true that $f^{-1}$(“open”) = “open” when $f$ is continuous, and so the argument
goes through.

A convenient property of continuous functions on compact sets is uniform
continuity. As always, a continuous $f : \mathcal{S} \to \mathcal{T}$ has the property
that for each $\varepsilon > 0$ there is a $\delta > 0$ such that $f$ maps a $\delta$-neighborhood of
each point $P \in \mathcal{S}$ into an $\varepsilon$-neighborhood of $f(P) \in \mathcal{T}$. We say that $f$ is
uniformly continuous if $\delta$ depends only on $\varepsilon$, not on $P$.


Uniform continuity. If $\mathcal{K}$ is a compact subset of $\mathbb{R}^m$ and $f : \mathcal{K} \to \mathbb{R}^n$ is
continuous, then $f$ is uniformly continuous.


Proof. Since $f$ is continuous, for any $\varepsilon > 0$ and any $P \in \mathcal{K}$ there is a neighborhood
$N_{\delta(P)}(P)$ mapped by $f$ into $N_{\varepsilon/2}(f(P))$. To create some room to
move later, we cover $\mathcal{K}$ with the half-sized neighborhoods $N_{\delta(P)/2}(P)$,
then apply compactness to conclude that $\mathcal{K}$ is contained in some finite
union of them, say

$$\mathcal{K} \subseteq N_{\delta(P_1)/2}(P_1) \cup N_{\delta(P_2)/2}(P_2) \cup \cdots \cup N_{\delta(P_k)/2}(P_k).$$

If we let

$$\delta = \min\{\delta(P_1)/2, \delta(P_2)/2, \ldots, \delta(P_k)/2\},$$

then each point in $\mathcal{K}$ lies in a set $N_{\delta(P_i)/2}(P_i)$ and each of the sets $N_{\delta(P_i)}(P_i)$
has radius at least $2\delta$. I claim that $|Q - R| < \delta$ implies $|f(Q) - f(R)| < \varepsilon$
for any $Q, R \in \mathcal{K}$, so $f$ is uniformly continuous on $\mathcal{K}$.

To see why, take any $Q, R \in \mathcal{K}$ such that $|Q - R| < \delta$ and a half-sized
neighborhood $N_{\delta(P_i)/2}(P_i)$ that includes $Q$. Then

$$|P_i - Q| < \delta(P_i)/2 \quad\text{and}\quad |Q - R| < \delta \le \delta(P_i)/2,$$

so it follows by the triangle inequality that

$$|P_i - R| < \delta(P_i), \quad\text{and hence}\quad R \in N_{\delta(P_i)}(P_i).$$

Also, it follows from the definition of $N_{\delta(P_i)}(P_i)$ that $|f(P_i) - f(Q)| < \varepsilon/2$
and $|f(P_i) - f(R)| < \varepsilon/2$, so

$$|f(Q) - f(R)| < \varepsilon,$$

again by the triangle inequality.
Exercises


The above proof of uniform continuity is complicated by the possibility that $\mathcal{K}$ is
at least two-dimensional. This forces us to use triangles and the triangle inequality.
If we have $\mathcal{K} = [0,1]$ then a more straightforward proof exists.


8.5.1 Suppose that $N_{\delta(P_1)}(P_1) \cup N_{\delta(P_2)}(P_2) \cup \cdots \cup N_{\delta(P_k)}(P_k)$ is a finite union of
open intervals that contains $[0,1]$.
Use the finitely many endpoints of these intervals to define a number $\delta > 0$
such that any two points $P, Q \in [0,1]$ with $|P - Q| < \delta$ lie in the same
interval $N_{\delta(P_i)}(P_i)$.


8.5.2 Deduce from Exercise 8.5.1 that any continuous function on [0,1] is uni- 
formly continuous. 


8.6 Paths and path-connectedness 


The idea of a “curve” or “path” has evolved considerably over the course
of mathematical history. The old term locus (meaning place in Latin)
shows that a curve was once considered to be the (set of) places occupied
by points satisfying a certain geometric condition. For example, a circle is
the locus of points at a constant distance from a particular point, the center
of the circle. Later, under the influence of dynamics, a curve came to be
viewed as the orbit of a point moving according to some law of motion,
such as Newton’s law of gravitation. The position $p(t)$ of the moving point
at any time $t$ is some continuous function of $t$.

In topology today, we take the function itself to be the curve. That is,
a curve or path in a space $\mathcal{S}$ is a continuous function $p : [0,1] \to \mathcal{S}$. The
interval $[0,1]$ plays the role of the time interval over which the point is in
motion; any interval would do as well, and it is sometimes convenient to
allow arbitrary closed intervals, as we will do below. More importantly,
the path is the function $p$ and not just its image. A case in which the
image fails quite spectacularly to reflect the function is the space-filling
curve discovered by Peano in 1890. The image of Peano’s curve is a square
region of the plane, so the image cannot tell us even the endpoints $A = p(0)$
and $B = p(1)$ of the curve, let alone how the curve makes its way from $A$
to $B$.

In Lie theory, paths give a way to distinguish groups that are “all of a
piece,” such as the circle group SO(2), from groups that consist of “separate
pieces,” such as O(2). In Chapter 3 we showed connectedness by
describing specific paths. In the present chapter we wish to discuss paths
more generally, so we introduce the following general definitions.


Definitions. A path in a set $G$ is a continuous map $p : I \to G$, where
$I = [a,b]$ is some closed interval⁹ of real numbers. A set $G$ is called path-connected
if, for any $A, B \in G$, there is a path $p : [a,b] \to G$ with $p(a) = A$
and $p(b) = B$. If $p$ is a path from $A$ to $B$ with domain $[a,b]$ and $q$ is a path
from $B$ to $C$ with domain $[b,c]$ then we call the path $pq$ defined by

$$pq(t) = \begin{cases} p(t) & \text{if } t \in [a,b], \\ q(t) & \text{if } t \in [b,c], \end{cases}$$

the concatenation of $p$ and $q$.


Clearly, if there is a path p from A to B with domain [a,b] then there is 
a path p’ from A to B with any closed interval as domain. Thus if there are 
paths from A to B and from B to C we can always arrange for the domains 
of these paths to be contiguous intervals, so the concatenation of the two 
paths is defined. Indeed, we can insist that all paths have domain [0,1], at 
the cost of a slightly less natural definition of concatenation (this is often 
done in topology books). 

Whichever definition is chosen, one has the following consequences: 


• If there is a path from $A$ to $B$ then there is a “reverse” path from $B$
to $A$. (If $p$ with domain $[0,1]$ is a path from $A$ to $B$, consider the
function $q(t) = p(1 - t)$.)

• If there are paths in $G$ from $A$ to $B$, and from $B$ to $C$, then there is a
path in $G$ from $A$ to $C$. (Concatenate.)

• If $G^0$ is the subset of $G$ consisting of all $A \in G$ for which there is
a path from $\mathbf{1}$ to $A$, then $G^0$ is path-connected. (For any $B, C \in G^0$,
concatenate the paths from $B$ to $\mathbf{1}$ and from $\mathbf{1}$ to $C$.)
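
Reversal and concatenation are simple enough to express directly as operations on functions. The Python sketch below is a minimal illustration under our own conventions: a path is represented as a triple (function, a, b), the names `reverse` and `concatenate` are ours, and reversal on a general domain $[a,b]$ uses $t \mapsto p(a + b - t)$, which reduces to $p(1 - t)$ when the domain is $[0,1]$.

```python
def reverse(path):
    """Reverse a path given as (function, a, b): run it backward over [a, b]."""
    p, a, b = path
    return (lambda t: p(a + b - t), a, b)

def concatenate(path1, path2):
    """Concatenate a path on [a, b] with a path on [b, c] into one on [a, c]."""
    p, a, b = path1
    q, b2, c = path2
    # Exact comparison is adequate for this example; real data may need a tolerance.
    if b != b2 or p(b) != q(b):
        raise ValueError("the first path must end where and when the second begins")
    return (lambda t: p(t) if t <= b else q(t), a, c)

# Paths in the real line: p runs from 0 to 1 over [0, 1], q from 1 to 3 over [1, 2].
p = (lambda t: t, 0.0, 1.0)
q = (lambda t: 2.0 * t - 1.0, 1.0, 2.0)

pq, start, end = concatenate(p, q)   # a path from 0 to 3 over [0, 2]
back, _, _ = reverse(p)              # a path from 1 to 0 over [0, 1]
```

Both operations return continuous functions whenever the inputs are continuous, because the two pieces of a concatenation agree at the common time b.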


In a group $G$, the path-connected subset $G^0$ just described is called the
path-component of the identity, or simply the identity component. The set
$G^0$ has significant algebraic properties. These properties were explored
in some exercises in Chapter 3, but the following theorem and its proof
develop them more precisely.


⁹We regret that mathematicians use the [ , ] notation for both closed intervals and Lie
brackets, but it should always be clear from the context which meaning is intended.
Normality of the identity component. If $G^0$ is the identity component of
a matrix Lie group $G$, then $G^0$ is a normal subgroup of $G$.


Proof. First we prove that $G^0$ is a subgroup of $G$ by showing that $G^0$ is
closed under products and inverses.

If $A, B \in G^0$ then there are paths $A(t)$ from $\mathbf{1}$ to $A$ and $B(t)$ from $\mathbf{1}$ to $B$.
Since matrix multiplication is continuous, $AB(t)$ is a path in $G$ from $A$ to
$AB$, so it follows by concatenation of paths from $\mathbf{1}$ to $A$ and from $A$ to $AB$
that $AB \in G^0$. Similarly, $A^{-1}A(t)$ is a path in $G$ from $A^{-1}$ to $\mathbf{1}$, so it follows
by path reversal that $A^{-1}$ is also in $G^0$.

To prove that $G^0$ is normal we need to show that $AG^0A^{-1} = G^0$ for each
$A \in G$. It suffices to prove that $AG^0A^{-1} \subseteq G^0$ for each $A \in G$, because in
that case we have $G^0 \subseteq A^{-1}G^0A$ (multiplying the containment on the left
by $A^{-1}$ and on the right by $A$), and hence also $G^0 \subseteq AG^0A^{-1}$ (replacing the
arbitrary $A$ by $A^{-1}$).

It is true that $AG^0A^{-1} \subseteq G^0$, because $AG^0A^{-1}$ is a path-connected set
(the image of $G^0$ under the continuous maps of left and right multiplication
by $A$ and $A^{-1}$) and it includes the identity element of $G$ as $A\mathbf{1}A^{-1}$.


It follows from this theorem that a non-discrete matrix Lie group is not
simple unless it is path-connected. We know from Chapter 3 that O(n) is
not path-connected for any n, and that SO(n), SU(n), and Sp(n) are path-connected
for all n. Another interesting case, whose proof occurs as an
exercise on p. 49 of Hall [2003], is the following.


Path-connectedness of GL(n, C) 


Suppose that $A$ and $B$ are two matrices in GL(n,C), so $\det(A) \ne 0$ and
$\det(B) \ne 0$. We wish to find a path from $A$ to $B$ in GL(n, C), that is, through
the $n \times n$ complex matrices with nonzero determinant.

We look for this path among the matrices of the form $(1 - z)A + zB$,
where $z \in \mathbb{C}$. These matrices form a plane, parameterized by the complex
coordinate $z$, and the plane includes $A$ at $z = 0$ and $B$ at $z = 1$. The path
from $A$ to $B$ has to avoid matrices $(1 - z)A + zB$ for which

$$\det((1 - z)A + zB) = 0. \qquad (*)$$

Now $(1 - z)A + zB$ is an $n \times n$ complex matrix whose entries are linear
terms in $z$. Its determinant is therefore a polynomial of degree at most $n$ in
$z$. This polynomial is not identically zero, because its value at $z = 0$ is
$\det(A) \ne 0$, and so, by the fundamental theorem of algebra, equation (*) has at most
$n$ roots.

These roots represent points in the plane of matrices $(1 - z)A + zB$,
not including the points $A$ and $B$. This allows us to find a path, from $A$ to $B$
in the plane, avoiding the points with determinant zero, as required. Thus
GL(n, C) is path-connected.
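
A small worked case may make the construction concrete. In the Python sketch below, the matrices $A = \mathbf{1}$ and $B = -\mathbf{1}$ in GL(2, C), the helper names `blend`, `det2`, and `detour`, and the particular curve $z(t) = t + i\,t(1 - t)$ are all our own choices for illustration. Here $\det((1 - z)A + zB) = (1 - 2z)^2$, so the only root is $z = 1/2$: the straight segment from 0 to 1 passes through it, while the curved detour does not.

```python
def blend(z, a, b):
    """The matrix (1 - z) * a + z * b for 2 x 2 matrices a, b and complex z."""
    return [[(1 - z) * a[i][j] + z * b[i][j] for j in range(2)] for i in range(2)]

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

A = [[1, 0], [0, 1]]      # the identity matrix
B = [[-1, 0], [0, -1]]    # its negative; both are invertible

def detour(t):
    """A curve in the z-plane from z = 0 to z = 1 that bends around z = 1/2."""
    return complex(t, t * (1 - t))

samples = [k / 200 for k in range(201)]

# On the real segment the determinant vanishes at z = 1/2.
det_on_segment = det2(blend(0.5, A, B))

# On the detour the determinant stays away from zero at every sample point.
min_det_on_detour = min(abs(det2(blend(detour(t), A, B))) for t in samples)
```

A short calculation shows that along the detour the absolute value of the determinant is $(1 - 2t)^2 + 4t^2(1 - t)^2$, which is never less than $1/4$, so the corresponding matrices form a path from $A$ to $B$ inside GL(2, C).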


Generating a path-connected group from a neighborhood of 1 


In Section 7.5 we claimed that any path-connected matrix Lie group $G$
is generated by a neighborhood $N_\delta(\mathbf{1})$ of $\mathbf{1}$, that is, any element of $G$ is a
product of members of $N_\delta(\mathbf{1})$. We can now prove this theorem with the
help of compactness.


Generating a path-connected group. If $G$ is a path-connected matrix Lie
group, and $N_\delta(\mathbf{1})$ is a neighborhood of $\mathbf{1}$ in $G$, then any element of $G$ is a
product of members of $N_\delta(\mathbf{1})$.


Proof. Since $G$ is path-connected, for any $A \in G$ there is a path $A(t)$ in
$G$ with $A(0) = \mathbf{1}$ and $A(1) = A$. Also, for each $t$, multiplication by $A(t)$
is a continuous map with a continuous inverse (namely, multiplication by
$A(t)^{-1}$). Hence, if $\mathcal{O}$ is any open set that includes $\mathbf{1}$, the set

$$A(t)\mathcal{O} = \{A(t)B : B \in \mathcal{O}\}$$

is an open set that includes the point $A(t)$. As $t$ runs from 0 to 1 the open
sets $A(t)\mathcal{O}$ cover the image of the path $A(t)$, which is the continuous image
of the compact set $[0,1]$, hence compact by the first theorem in Section 8.5.
So in fact the image of the path lies in a finite union of sets,

$$A(t_1)\mathcal{O} \cup A(t_2)\mathcal{O} \cup \cdots \cup A(t_k)\mathcal{O}.$$

We can therefore find points $\mathbf{1} = A_1, A_2, \ldots, A_m = A$ on the path $A(t)$
such that, for any $i$, $A_i$ and $A_{i+1}$ lie in the same set $A(t_j)\mathcal{O}$. Notice that

$$A = A_1 \cdot A_1^{-1}A_2 \cdot A_2^{-1}A_3 \cdots A_{m-1}^{-1}A_m.$$

We can arrange that each factor of this product is in $N_\delta(\mathbf{1})$ by taking $\mathcal{O}$ to be
a subset of $N_\delta(\mathbf{1})$ small enough that $B_i^{-1}B_{i+1} \in N_\delta(\mathbf{1})$ for any $B_i, B_{i+1} \in \mathcal{O}$.
Then for each $i$ we have

$$A_i^{-1}A_{i+1} = (A(t_j)B_i)^{-1}A(t_j)B_{i+1} \quad\text{for some } t_j \text{ and some } B_i, B_{i+1} \in \mathcal{O}$$
$$= B_i^{-1}B_{i+1} \in N_\delta(\mathbf{1}).$$
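
The telescoping product is easy to watch in the simplest path-connected group, SO(2). The Python sketch below is an illustration under our own assumptions: the target rotation angle 3.0, the number of points m = 101, the radius delta = 0.05, and all helper names are ours. It takes equally spaced points $A_1 = \mathbf{1}, A_2, \ldots, A_m = A$ on the path of rotations, forms the factors $A_i^{-1}A_{i+1}$, and multiplies them back together.

```python
import math

def rot(theta):
    """Rotation of the plane through angle theta, a member of SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose, which is the inverse for a matrix in SO(2)."""
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

def dist(a, b):
    """Distance |a - b| between 2 x 2 matrices."""
    return math.sqrt(sum((a[i][j] - b[i][j]) ** 2
                         for i in range(2) for j in range(2)))

identity = [[1.0, 0.0], [0.0, 1.0]]
theta, m, delta = 3.0, 101, 0.05

# Points A_1 = 1, A_2, ..., A_m = A along the path A(t) = rot(t * theta).
points = [rot(theta * i / (m - 1)) for i in range(m)]

# Factors A_i^{-1} A_{i+1}, each a small rotation close to the identity.
factors = [mul(transpose(points[i]), points[i + 1]) for i in range(m - 1)]

# The product A_1 * (A_1^{-1} A_2) * ... * (A_{m-1}^{-1} A_m) telescopes to A_m.
product = points[0]
for factor in factors:
    product = mul(product, factor)
```

Every factor is a rotation through the small angle 0.03, so it lies within distance 0.05 of the identity, yet the product is the rotation through the full angle 3.0.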
Corollary. If $G$ is a path-connected matrix Lie group then each element of
$G$ has the form $e^{X_1}e^{X_2}\cdots e^{X_m}$ for some $X_1, X_2, \ldots, X_m \in T_{\mathbf{1}}(G)$.


Proof. Rerun the proof above with $N_\delta(\mathbf{1})$ chosen so that each element of
$N_\delta(\mathbf{1})$ has the form $e^X$, as is permissible by the theorem of Section 7.4.
Then each factor in the product

$$A = A_1 \cdot A_1^{-1}A_2 \cdot A_2^{-1}A_3 \cdots A_{m-1}^{-1}A_m$$

has the form $e^{X_i}$.


Exercises 


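The corollary brings to mind the element $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ of SL(2,C), shown not to be of
the form $e^X$ for $X \in T_{\mathbf{1}}(\mathrm{SL}(2,\mathbb{C}))$ in Exercise 5.6.5.


8.6.1 Write $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix}$ as the product of two matrices in SL(2,C) with entries 0, $i$,
or $-i$.


8.6.2 Deduce from Exercise 8.6.1 and Exercise 5.6.4 that $\begin{pmatrix} -1 & 1 \\ 0 & -1 \end{pmatrix} = e^{X_1}e^{X_2}$ for
some $X_1, X_2 \in T_{\mathbf{1}}(\mathrm{SL}(2,\mathbb{C}))$.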


8.7 Simple connectedness 


A space $\mathcal{S}$ is called simply connected if it is path-connected and, for any
two paths $p$ and $q$ in $\mathcal{S}$ from point $A$ to point $B$, there is a deformation of
$p$ to $q$ with endpoints fixed. A deformation (or homotopy) of a path $p$ to
path $q$ is a continuous function of two real variables,

$$d : [0,1] \times [0,1] \to \mathcal{S}$$

such that

$$d(0,t) = p(t) \quad\text{and}\quad d(1,t) = q(t).$$

And the endpoints are fixed if

$$d(s,0) = p(0) = q(0) \quad\text{and}\quad d(s,1) = p(1) = q(1) \quad\text{for all } s.$$

Here one views the first variable as “time” and imagines a continuously
moving curve that equals $p$ at time 0 and $q$ at time 1. So $d$ is a “deformation
from curve $p$ to curve $q$.”
The restriction of $d$ to the bottom edge of the square $[0,1] \times [0,1]$ is one
path $p$, the restriction to the top edge is another path $q$, and the restriction
to the various horizontal sections of the square is a “continuous series”
of paths between $p$ and $q$. Figure 8.3 shows several of these sections, in
different shades of gray, and their images under some continuous map $d$.
These are “snapshots” of the deformation, so to speak.¹⁰




Figure 8.3: Snapshots of a path deformation with endpoints fixed. 


Simple connectivity is easy to define, but is quite hard to demonstrate
in all but the simplest case, which is that of $\mathbb{R}^k$. If $p$ and $q$ are paths in
$\mathbb{R}^k$ from $A$ to $B$, then $p$ and $q$ may each be deformed into the line segment
$AB$, and hence into each other. To deform $p$, say, one can move the point
$p(t)$ along the line segment from $p(t)$ to the point $(1 - t)A + tB$, traveling
a fraction $s$ of the total distance along this line in time $s$.
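
The straight-line deformation just described can be written as the formula $d(s,t) = (1 - s)\,p(t) + s\,((1 - t)A + tB)$. The Python sketch below implements it; the function names `segment` and `straighten`, and the sample path `arch` in the plane, are our own assumptions for illustration.

```python
import math

def segment(a, b, t):
    """The point (1 - t) * a + t * b on the line segment from a to b."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))

def straighten(p, s, t):
    """Deformation d(s, t): move p(t) a fraction s of the way to the segment point."""
    a, b = p(0.0), p(1.0)
    return segment(p(t), segment(a, b, t), s)

def arch(t):
    """A sample path in the plane from (0, 0) to (1, 0): half of a sine wave."""
    return (t, math.sin(math.pi * t))

# The top of the arch, (0.5, 1), pulled halfway down toward the segment point (0.5, 0).
halfway = straighten(arch, 0.5, 0.5)
```

At $s = 0$ the function returns the original path, at $s = 1$ it returns the line segment $AB$, and for every $s$ the endpoints stay at $A$ and $B$, so this is a deformation with endpoints fixed.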

The next-simplest case, that of $S^k$ for k > 1, includes the important Lie
group SU(2) = Sp(1), the $S^3$ of unit quaternions. On the sphere there
is not necessarily a unique “line segment” from p(t) to the point we may
want to send it to, so the above argument for $\mathbb{R}^k$ does not work. One
can project $S^k$ minus one point P onto $\mathbb{R}^k$, and then do the
deformation in $\mathbb{R}^k$, but projection requires a point P not in the image
of p, and hence it fails when p is a space-filling curve. To overcome the
difficulty one appeals to compactness, which makes it possible to show that
any path may be divided into a finite number of “small” pieces, each of
which may be deformed on


¹⁰Defining simple connectivity in terms of deformation of paths between any two points
A and B is convenient for our purposes, but there is a common equivalent definition in terms
of closed paths: $\mathcal{S}$ is simply connected if every closed path may be deformed to a point.
To see the equivalence, consider the closed path from A to B via p and back again via q.
(Or, strictly speaking, via the “inverse of path q” defined by the function $q(1-t)$.)




the sphere to a “line segment” (a great circle arc). This clears space on the 
sphere that enables the projection method to work. For more details see the 
exercises below. 

Compactness is also important in proving that certain groups are not
simply connected. The most important case is the circle $S^1$ = SO(2), which
we now study in detail, because the idea of “lifting,” introduced here, will
be important in Chapter 9.


The circle and the line 


The function $f(\theta) = (\cos\theta, \sin\theta)$ maps $\mathbb{R}$ onto the unit
circle $S^1$. It is called a covering of $S^1$ by $\mathbb{R}$ and the points
$\theta + 2n\pi \in \mathbb{R}$ are said to lie over the point
$(\cos\theta, \sin\theta) \in S^1$. This map is far from being 1-to-1, because
infinitely many points of $\mathbb{R}$ lie over each point of $S^1$. For example,
the points over (1,0) are the real numbers $2n\pi$ for all integers n (Figure 8.4).


Figure 8.4: The covering of the circle by the line.
(Points marked on the line: $-4\pi$, $-2\pi$, 0, $2\pi$, $4\pi$.)


However, the restriction of f to any interval of $\mathbb{R}$ with length $< 2\pi$
is 1-to-1 and continuous in both directions, so f may be called a local
homeomorphism. Figure 8.4 shows an arc of $S^1$ (in gray) of length $< 2\pi$
and all the intervals of $\mathbb{R}$ mapped onto it by f. The restriction of f to
any one of these gray intervals is a homeomorphism.

The local homeomorphism property of f allows us to relate path deformations
in $S^1$ to path deformations in $\mathbb{R}$, which are more easily understood.
The first step is the following theorem, relating paths in $S^1$ to paths in
$\mathbb{R}$ by a process called lifting.


Unique path lifting. Suppose that p is a path in $S^1$ with initial point P,
and $\tilde{P}$ is a point in $\mathbb{R}$ over P. Then there is a unique path
$\tilde{p}$ in $\mathbb{R}$ such that $\tilde{p}(0) = \tilde{P}$ and
$f \circ \tilde{p} = p$. We call $\tilde{p}$ the lift of p with initial point $\tilde{P}$.


Proof. The path p is a continuous function from [0,1] into $S^1$, and hence it
is uniformly continuous by the theorem in Section 8.5. This means that we
can divide [0,1] into a finite number of subintervals, say
$\mathcal{I}_1, \mathcal{I}_2, \ldots, \mathcal{I}_k$ in left-to-right order, each of
which is mapped by p into an arc of $S^1$ of length $< 2\pi$. We let $p_j$ be the
restriction of p to $\mathcal{I}_j$ and allow the term “path” to include all
continuous functions on intervals of $\mathbb{R}$.

Then, since f has a continuous inverse on intervals of length $< 2\pi$:


• There is a unique path $\tilde{p}_1 : \mathcal{I}_1 \to \mathbb{R}$, with initial
point $\tilde{P}$, such that $f \circ \tilde{p}_1 = p_1$. Namely,
$\tilde{p}_1(t) = f^{-1}(p_1(t))$, where $f^{-1}$ is the inverse of f in the
neighborhood of $\tilde{P}$. Let the final point of $\tilde{p}_1$ be $\tilde{P}_1$.


• Similarly, there is a unique path $\tilde{p}_2 : \mathcal{I}_2 \to \mathbb{R}$, with
initial point $\tilde{P}_1$, such that $f \circ \tilde{p}_2 = p_2$, and with final
point $\tilde{P}_2$, say.


• And so on.


The concatenation of these paths $\tilde{p}_j$ in $\mathbb{R}$ is the lift
$\tilde{p}$ of p with initial point $\tilde{P}$.
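
The step-by-step construction in the proof has a simple numerical analogue. The sketch below assumes the path is given as a finite list of sample points on the circle, fine enough that consecutive samples are less than half a turn apart, so that the local inverse of f is unambiguous at each step. The names `lift` and `path`, and the example of a path winding twice around the circle, are illustrative assumptions.

```python
import math

def f(theta):
    """The covering map f(theta) = (cos theta, sin theta) of the circle by the line."""
    return (math.cos(theta), math.sin(theta))

def lift(points, start):
    """Lift a finely sampled path on the circle to the line, beginning at `start`.

    `points` is a list of (x, y) samples on the unit circle and `start` must
    lie over points[0]. Each step applies the local inverse of f near the
    previously lifted point.
    """
    lifted = [start]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Signed angle from one sample to the next, in the interval (-pi, pi].
        step = math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
        lifted.append(lifted[-1] + step)
    return lifted

# Hypothetical example: a path winding twice around the circle from (1, 0).
n = 200
path = [f(4 * math.pi * i / n) for i in range(n + 1)]
lifted = lift(path, 0.0)
print(lifted[-1] / math.pi)  # close to 4: the lift ends at 4*pi, not back at 0
```

Although the path on the circle is closed, its lift is not: the final point of the lift records how many times the path has wound around the circle.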


There is a similar proof of “unique deformation lifting” that leads to
the following result. Suppose p and q are paths from A to B in $S^1$ and p is
deformable to q with endpoints fixed. Then the lift $\tilde{p}$ of p with initial
point $\tilde{A}$ is deformable to the lift $\tilde{q}$ of q with initial point
$\tilde{A}$ with endpoints fixed.


Now we are finally ready to prove that $S^1$ is not simply connected. In
particular, we can prove that the upper semicircle path


$$p(t) = (\cos \pi t, \sin \pi t) \quad\text{from } (1,0) \text{ to } (-1,0)$$
is not deformable to the lower semicircle path
$$q(t) = (\cos(-\pi t), \sin(-\pi t)) \quad\text{from } (1,0) \text{ to } (-1,0).$$


This is because the lift $\tilde{p}$ of p with initial point 0 has final point
$\pi$, whereas the lift $\tilde{q}$ of q with initial point 0 has final point
$-\pi$. Hence there is no deformation of $\tilde{p}$ to $\tilde{q}$ with the
endpoints fixed, and therefore no deformation of p to q with endpoints fixed.


Exercises 


To see why the spheres $S^k$ with k > 1 are simply connected, first consider the
ordinary sphere $S^2$.


8.7.1 Explain why, in a sufficiently small region of $S^2$, there is a unique “line”
between any two points.


8.7.2 Use the uniqueness of “lines” in a small region $\mathcal{R}$ to define a
deformation of any curve p from A to B in $\mathcal{R}$ to the “line” from A to B.


Exercises 8.7.1 and 8.7.2, together with uniform continuity, allow any curve p on
$S^2$ to be deformed into a “spherical polygon,” which can then be projected onto a
curve on the plane.

It is geometrically obvious that there is a homeomorphism from $S^2 - \{P\}$
onto $\mathbb{R}^2$ for any point $P \in S^2$. Namely, choose coordinates so that P
is the north pole (0,0,1) and map $S^2 - \{P\}$ onto $\mathbb{R}^2$ by stereographic
projection, as shown in Figure 8.5.


Figure 8.5: Stereographic projection.


To generalize this idea to any $S^k$ we have to describe stereographic projection
algebraically. So consider the $S^k$ in $\mathbb{R}^{k+1}$, defined by the equation
$$x_1^2 + x_2^2 + \cdots + x_{k+1}^2 = 1.$$
We project $S^k$ stereographically from the “north pole” P = (0,0,...,0,1) onto the
subspace $\mathbb{R}^k$ with equation $x_{k+1} = 0$.


8.7.3 Verify that the line through P and any other point
$(a_1, a_2, \ldots, a_{k+1}) \in S^k$ has parametric equations


$$x_1 = a_1 t, \quad x_2 = a_2 t, \quad \ldots, \quad x_k = a_k t, \quad x_{k+1} = 1 + (a_{k+1} - 1)t.$$


8.7.4 Show that the line in Exercise 8.7.3 meets the hyperplane $x_{k+1} = 0$ where


$$x_1 = \frac{a_1}{1 - a_{k+1}}, \quad x_2 = \frac{a_2}{1 - a_{k+1}}, \quad \ldots, \quad x_k = \frac{a_k}{1 - a_{k+1}}.$$


8.7.5 By solving the equations in Exercise 8.7.4, or otherwise, show that


$$a_1 = \frac{2x_1}{x_1^2 + \cdots + x_k^2 + 1}, \quad \ldots, \quad a_k = \frac{2x_k}{x_1^2 + \cdots + x_k^2 + 1}$$
and
$$a_{k+1} = \frac{x_1^2 + \cdots + x_k^2 - 1}{x_1^2 + \cdots + x_k^2 + 1}.$$


Hence conclude that stereographic projection is a homeomorphism.
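
The formulas in Exercises 8.7.4 and 8.7.5 can be checked numerically before they are proved. The sketch below is only a sanity check, not a solution to the exercises; the function names `project` and `unproject` and the sample point are illustrative assumptions.

```python
import math

def project(a):
    """Stereographic projection from the north pole: S^k minus P -> R^k."""
    *head, last = a
    return tuple(ai / (1 - last) for ai in head)

def unproject(x):
    """Candidate inverse map R^k -> S^k minus P, from the formulas of Exercise 8.7.5."""
    r2 = sum(xi * xi for xi in x)
    return tuple(2 * xi / (r2 + 1) for xi in x) + ((r2 - 1) / (r2 + 1),)

# Hypothetical sample point on S^3: normalize an arbitrary vector of R^4.
v = (1.0, -2.0, 0.5, 0.25)
norm = math.sqrt(sum(c * c for c in v))
a = tuple(c / norm for c in v)

x = project(a)
back = unproject(x)
print(max(abs(s - t) for s, t in zip(a, back)))  # round-trip error, near zero
```

Both maps are built from sums, products, and quotients with nonvanishing denominators, which is the reason each is continuous.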
8.8 Discussion


Closed, connected sets can be extremely pathological, even in $\mathbb{R}^2$. For
example, consider the set called the Sierpinski carpet, which consists of 
the unit square with infinitely many open squares removed. Figure 8.6 
shows what it looks like after several stages of construction. The original 
unit square was black, and the white “holes” are where squares have been 
removed. In reality, the total area of removed squares is 1, so the carpet is 
“almost all holes.” Nevertheless, it is a closed, path-connected set. 


Figure 8.6: The Sierpinski carpet 


Remarkably, imposing the condition that the closed set be a continu- 
ous group removes any possibility of pathology, at least in the spaces of 
$n \times n$ matrices. As von Neumann [1929] showed, a closed subgroup G of
GL(n,C) has a neighborhood of 1 that can be mapped to a neighborhood 
of 0 in some Euclidean space by the logarithm function, so G is certainly 
not full of holes. Also G is smooth, in the sense of having a tangent space 
at each point. 

Thus in the world of matrix groups it is possible to avoid the technical- 
ities of smooth manifolds and work with the easier concepts of open sets, 
closed sets, and continuous functions. 

In this book we avoid the concept of smooth manifold; indeed, this is
one of the great advantages of restricting attention to matrix Lie groups.
But we have, of course, investigated “smoothness” as manifested by the
existence of a tangent space at the identity (and hence at every point) for
each matrix Lie group G. As we saw in Chapter 7, every matrix Lie group
G has a tangent space $T_1(G)$ at the identity, and $T_1(G)$ equals some
$\mathbb{R}^k$. Even finite groups, such as G = {1}, have a tangent space at the
identity; not surprisingly it is the space $\mathbb{R}^0$.

Topology gives a way to describe all the matrix Lie groups with zero
tangent space: they are the discrete groups, where a group H is called
discrete if there is a neighborhood of 1 not containing any elements of H
except 1 itself. Every finite group is obviously discrete, but there are also
infinite discrete groups; for example, $\mathbb{Z}$ is a discrete subgroup of
$\mathbb{R}$. The groups $\mathbb{Z}$ and $\mathbb{R}$ can be viewed as matrix groups
by associating each $x \in \mathbb{R}$ with the matrix
$\begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$ (because multiplying two such
matrices results in addition of their x entries).
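
The claim that multiplying these matrices adds their x entries is quick to confirm. This is a minimal sketch with matrices stored as nested tuples; the names `shear` and `matmul` are illustrative assumptions.

```python
def shear(x):
    """The 2 x 2 matrix with rows (1, x) and (0, 1) associated with the number x."""
    return ((1, x), (0, 1))

def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

product = matmul(shear(3), shear(5))
print(product == shear(3 + 5))  # the x entries have been added
```

Restricting x to the integers gives the matrices that represent $\mathbb{Z}$ inside this copy of $\mathbb{R}$.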

It follows immediately from the definition of discreteness that
$T_1(H) = \{0\}$ for a discrete group H. It also follows that if H is a discrete subgroup
of a matrix Lie group G then G/H is “locally isomorphic” to G in some 
neighborhood of 1. This is because every element of G in some neighbor- 
hood of 1 belongs to a different coset. From this we conclude that G/H 
and G have the same tangent space at 1, and hence the same Lie alge- 
bra. This result shows, once again, why Lie algebras are simpler than Lie 
groups—they do not “see” discrete subgroups. 

Apart from the existence of a tangent space, there is an algebraic reason 
for including the discrete matrix groups among the matrix Lie groups: they 
occur as kernels of “Lie homomorphisms.” Since everything in Lie theory 
is supposed to be smooth, the only homomorphisms between Lie groups 
that belong to Lie theory are the smooth ones. We will not attempt a general 
definition of smooth homomorphism here, but merely give an example: the 
map $\Phi : \mathbb{R} \to S^1$ defined by


$$\Phi(\theta) = e^{i\theta}.$$


This is surely a smooth map because $\Phi$ is a differentiable function of
$\theta$. The kernel of this $\Phi$ is the discrete subgroup of $\mathbb{R}$
(isomorphic to $\mathbb{Z}$) consisting of the integer multiples of $2\pi$. We
would like any natural aspect of a
Lie “thing” to be another Lie “thing,” so the kernel of a smooth homomor- 
phism ought to be a Lie group. This is an algebraic reason for considering 
the discrete group Z to be a Lie group. 
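
The homomorphism property of $\theta \mapsto e^{i\theta}$ and its kernel can be illustrated with complex arithmetic. The sketch below uses Python's `cmath` module; the function name `Phi` and the sample angles are illustrative assumptions.

```python
import cmath
import math

def Phi(theta):
    """The map theta -> e^{i theta} from the line to the unit circle in C."""
    return cmath.exp(1j * theta)

a, b = 0.7, 2.9  # arbitrary sample angles

# Homomorphism property: addition in R goes to multiplication on the circle.
print(abs(Phi(a + b) - Phi(a) * Phi(b)))

# Integer multiples of 2*pi lie in the kernel (up to rounding error).
print([abs(Phi(2 * math.pi * n) - 1) for n in range(-2, 3)])
```

Both printed quantities are zero up to floating-point error, in line with the statement that the kernel consists of the integer multiples of $2\pi$.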

The concepts of compactness, path-connectedness, simple connectedness,
and coverings play a fundamental role in topology, as a glance at any
topology book will show. Their role in Lie theory is also fundamental, and
in fact Lie theory provides some of the best illustrations of these concepts.
The covering of $S^1$ by $\mathbb{R}$ is one, and we will see more in the next chapter.


Closed paths in SO(3) 


The group SO(3) of rotations of $\mathbb{R}^3$ is a striking example of a matrix Lie
group that is not simply connected. We exhibit a closed path in SO(3) that 
cannot be deformed to a point in an informal demonstration known as the 
“plate trick.” 

Imagine carrying a plate of soup in one hand, keeping the plate hori- 
zontal to avoid spilling. Now rotate the plate through 360°, returning it to 
its original position in space (first three pictures in Figure 8.7). 


Figure 8.7: The plate trick. 


The series of positions of the plate up to this stage may be regarded as
a continuous path in SO(3). This is because each position is determined by
an “axis” in $\mathbb{R}^3$ (the vector from the shoulder to the hand) and an
“angle” (the angle through which the plate has turned). This path in SO(3)
is closed because the initial and final points are the same (axis, angle)
pair. We can “deform” the path by varying the position of the arm and hand
between the initial and final positions. But it seems intuitively clear that
we cannot deform the path to a single point, because the path creates a full
twist in the arm, which cannot be removed by varying the path between the
initial and final positions.

However, traversing (a deformation of) the path again, as shown in 
the last three pictures, returns the arm and hand to their initial untwisted 
state! The topological meaning of this trick is that there is a closed path 
p in SO(3) that cannot be deformed to a point, whereas $p^2$ (the result of
traversing p twice) can be deformed to a point. This topological property,
appropriately called torsion, is actually characteristic of projective spaces,
of which SO(3) is one. As we saw in Sections 2.2 and 2.3, SO(3) is the
same as the real projective space $\mathbb{RP}^3$.


9 


Simply connected Lie groups 


PREVIEW 


Throughout our exposition of Lie algebras we have claimed that the struc- 
ture of the Lie algebra g of a Lie group G captures most, if not all, of the 
structure of G. Now it is time to explain what, if anything, is lost when we 
pass from G to g. The short answer is that topological information is lost, 
because the tangent space g cannot reveal how G may “wrap around” far 
from the identity element. 

The loss of information is already apparent in the case of R, O(2), and 
SO(2), all of which have the line as tangent space. A more interesting case 
is that of O(3), SO(3), and SU(2), all of which have the Lie algebra $\mathfrak{so}(3)$.
These three groups are not isomorphic, and the differences between them 
are best expressed in topological language, because the differences persist 
even if we distort O(3), SO(3), and SU(2) by continuous 1-to-1 maps. 

First, O(3) differs topologically from SO(3) and SU(2) because it is 
not path-connected; there are two points in O(3) not connected by a path 
in O(3). Second, SU(2) differs topologically from SO(3) in being simply 
connected; that is, any closed path in SU(2) can be shrunk to a point. 

We elaborate on these properties of O(3), SO(3), and SU(2) in Sections
9.1 and 9.2. Then we turn to the relationship between homomorphisms of
Lie groups and homomorphisms of Lie algebras: a Lie group homomorphism
$\Phi : G \to H$ “induces” a Lie algebra homomorphism
$\varphi : \mathfrak{g} \to \mathfrak{h}$, and if G and H are simply connected then
$\varphi$ uniquely determines $\Phi$. This leads to a definitive result on the
extent to which a Lie algebra $\mathfrak{g}$ “determines” its Lie group G: all
simply connected groups with the same Lie algebra are isomorphic.


9.1 Three groups with tangent space $\mathbb{R}$


The groups O(2) and SO(2) have the same tangent space, namely the tan- 
gent line at the identity in SO(2), because the elements of O(2) not in 
SO(2) are far from the identity and hence have no influence on the tangent 
space. Figure 9.1 gives a geometric view of the situation. 

The group SO(2) is shown as a circle, because SO(2) can be modeled 
by the circle $\{z : |z| = 1\}$ in the plane of complex numbers. Its complement
O(2) − SO(2) is the coset R · SO(2), where R is any reflection of the plane
in a line through the origin. We can also view R · SO(2) as a circle (lying
somewhere in the space of 2 × 2 real matrices), since multiplication by R
produces a continuous 1-to-1 image of SO(2). The circle O(2) − SO(2) is
disjoint from SO(2) because distinct cosets are always disjoint. In particular,
O(2) − SO(2) does not include the identity, so the tangent to O(2) at
the identity is simply the tangent to SO(2) at 1:


$$T_1(\mathrm{O}(2)) = T_1(\mathrm{SO}(2)).$$


Figure 9.1: Tangent space of both SO(2) and O(2).
(Labels in the figure: $T_1(\mathrm{SO}(2))$, SO(2), and O(2) − SO(2).)


As a vector space, the tangent has the same structure as the real line 
R (addition of tangent vectors is addition of numbers, and scalar multiples 
are real multiples). The tangent also has a Lie bracket operation, but not an 
interesting one, because XY = YX for $X, Y \in \mathbb{R}$, so


$$[X,Y] = XY - YX = 0 \quad\text{for all } X, Y \in \mathbb{R}.$$


Another Lie group with the same trivial Lie algebra is R itself (under the 
addition operation). It is clear that R is its own tangent space. 




Thus we have three Lie groups with the same Lie algebra: O(2), SO(2), 
and R. These groups can be distinguished algebraically in various ways 
(exercises), but the most obvious differences between them are topological: 


• O(2) is not path-connected.


• SO(2) is path-connected but not simply connected, that is, there is a
closed path in SO(2) that cannot be continuously shrunk to a point.


• $\mathbb{R}$ is path-connected and simply connected.


Another difference is that both O(2) and SO(2) are compact, that is, closed 
and bounded, and R is not. 

As this chapter unfolds, we will see that the properties of compactness, 
path-connectedness, and simple connectedness are crucial for distinguish- 
ing between Lie groups with the same Lie algebra. These properties are 
“squeezed out” of the Lie group G when we form its Lie algebra g, and 
we need to put them back in order to “reconstitute” G from g. In partic- 
ular, we will see in Section 9.6 that G can be reconstituted uniquely from 
g if we know that G is simply connected. But before looking at simple 
connectedness more closely, we study another example. 


Exercises 


9.1.1 Find algebraic properties showing that the groups O(2), SO(2), and R are 
not isomorphic. 


From the circle group $S^1$ = SO(2) and the line group $\mathbb{R}$ we can construct
three two-dimensional groups as Cartesian products: $S^1 \times S^1$,
$S^1 \times \mathbb{R}$, and $\mathbb{R} \times \mathbb{R}$.


9.1.2 Explain why it is appropriate to call these groups the torus, cylinder, and 
plane, respectively. 


9.1.3 Show that the three groups have the same Lie algebra. Describe its under- 
lying vector space and Lie bracket operation. 


9.1.4 Distinguish the three groups algebraically and topologically. 


9.2 Three groups with the cross-product Lie algebra 


At various points in this book we have met the groups O(3), SO(3), and
SU(2), and observed that they all have the same Lie algebra: $\mathbb{R}^3$ with
the cross product operation. Their Lie algebra may also be viewed as the space
$\mathbb{R}i + \mathbb{R}j + \mathbb{R}k$ of pure imaginary quaternions, with the Lie
bracket operation defined in terms of the quaternion product by


$$[X,Y] = XY - YX.$$
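
The two descriptions of this Lie algebra can be compared by direct computation. In the sketch below a quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d); the helper names `qmul`, `bracket`, and `cross` and the sample vectors are illustrative assumptions. With this bracket the result is twice the cross product, a constant factor that does not affect the structure of the algebra.

```python
def qmul(p, q):
    """Product of quaternions written as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX of two quaternions."""
    return tuple(s - t for s, t in zip(qmul(X, Y), qmul(Y, X)))

def cross(u, v):
    """Ordinary cross product in R^3."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

u, v = (1, 2, 3), (-4, 0, 5)   # arbitrary vectors in R^3
X, Y = (0,) + u, (0,) + v      # the same vectors as pure imaginary quaternions

print(bracket(X, Y))   # real part 0, imaginary part twice the cross product
print(cross(u, v))
```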


The groups O(3) and SO(3) differ in the same manner as O(2) and SO(2), 
namely, SO(3) is path-connected and O(3) is not. In fact, SO(3) is the 
connected component of the identity in O(3): the subset of O(3) whose 
members are connected to the identity by paths. 

Thus O(3) and SO(3) (like O(2) and SO(2)) have the same tangent 
space at the identity simply because all members of O(3) near the identity 
are members of SO(3). The reason that SO(3) and SU(2) have the same 
tangent space is more subtle, and it involves a phenomenon not observed 
among the one-dimensional groups O(2) and SO(2): the covering of one 
compact group by another. 

As we saw in Section 2.3, each element of SO(3) (a rotation of $\mathbb{R}^3$)
corresponds to an antipodal point pair $\pm q$ of unit quaternions. If we
represent q and $-q$ by 2 × 2 complex matrices, they are elements of SU(2).
It follows, as we observed in Section 6.1, that SO(3) and SU(2) have the
same tangent vectors at the identity. However, the 2-to-1 map of SU(2)
onto SO(3) that sends the two antipodal quaternions q and $-q$ to the single
pair $\pm q$ creates a topological difference between SU(2) and SO(3).

The group SU(2) is the 3-sphere $S^3$ of quaternions q at unit distance
from O in $\mathbb{H} = \mathbb{R}^4$, and the 3-sphere is simply connected. To see
why, suppose p is a closed path in $S^3$ and suppose that N is a point of $S^3$
not on p. There is a continuous 1-to-1 map of $S^3 - \{N\}$ onto $\mathbb{R}^3$
with a continuous inverse, namely stereographic projection $\Pi$ (see the
exercises in Section 8.7). It is clear that the loop $\Pi(p)$ can be
continuously shrunk to a point in $\mathbb{R}^3$, for example, by magnifying its
size by $1 - t$ at time t for $0 \le t \le 1$. Hence the same is true of p by
mapping the shrinking process back into $S^3$ by $\Pi^{-1}$.

In contrast, the space SO(3) of antipodal point pairs $\pm q$, for $q \in S^3$,
is not simply connected. An informal explanation of this property is the
“plate trick” described in Section 8.8. More formally, consider a path
$\tilde{p}(s)$ in $S^3$ that begins at 1 and ends at −1, that is,
$\tilde{p}(0) = 1$ and $\tilde{p}(1) = -1$. Then the point pairs
$\pm\tilde{p}(s)$ for $0 \le s \le 1$ form a closed path p in SO(3) because
$\pm\tilde{p}(0)$ and $\pm\tilde{p}(1)$ are the same point pair $\pm 1$. Now, if p
can be continuously shrunk to a point, then p can be shrunk to a point keeping
the initial point $\pm 1$ fixed (consider the shrinking process relative to this
point). It follows (by “deformation lifting” as in Section 8.7) that the
corresponding curve $\tilde{p}$ on $S^3$ can be shrunk to a point, keeping its
endpoints 1 and −1 fixed. But this is absurd, because the latter are two
distinct points.
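
A concrete instance of such a path can be examined numerically. The sketch below takes the rotations about the k-axis through angle $2\pi s$ and the corresponding unit quaternions with half that angle; it uses the standard matrix of the rotation $v \mapsto q v q^{-1}$ for a unit quaternion q = a + bi + cj + dk. The names `rotation_matrix` and `lifted_path` are illustrative assumptions.

```python
import math

def rotation_matrix(q):
    """Matrix of the rotation v -> q v q^{-1} of R^3 for a unit quaternion q."""
    a, b, c, d = q
    return (
        (a*a + b*b - c*c - d*d, 2*(b*c - a*d),         2*(b*d + a*c)),
        (2*(b*c + a*d),         a*a - b*b + c*c - d*d, 2*(c*d - a*b)),
        (2*(b*d - a*c),         2*(c*d + a*b),         a*a - b*b - c*c + d*d),
    )

def lifted_path(s):
    """A path in S^3 from 1 (at s = 0) to -1 (at s = 1), turning about the k-axis."""
    return (math.cos(math.pi * s), 0.0, 0.0, math.sin(math.pi * s))

start, end = lifted_path(0.0), lifted_path(1.0)
print(start, end)                 # approximately 1 and -1: the lift is not closed
print(rotation_matrix(start))     # the identity rotation
print(rotation_matrix(end))       # the identity rotation again, up to rounding
```

The path of rotations closes up after one full turn, while the path of quaternions over it ends at −1; only after a second turn does the quaternion path return to 1, which matches the behavior seen in the plate trick.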

To sum up, we have: 


• The three compact groups O(3), SO(3), and SU(2) have the same
Lie algebra.


• SO(3) and SU(2) are connected but O(3) is not.


• SU(2) is simply connected but SO(3) is not.


The space SU(2) is said to be a double-covering of SO(3) because there
is a continuous 2-to-1 map of SU(2) onto SO(3) that is locally 1-to-1,
namely the map $q \mapsto \{\pm q\}$. This map is locally 1-to-1 because the only
point, other than q, that goes to $\{\pm q\}$ is the point $-q$, and a sufficiently
small neighborhood of q does not include $-q$. Thus the quaternions $q'$ in a
sufficiently small neighborhood of q in SU(2) correspond 1-to-1 with the
pairs $\{\pm q'\}$ in a neighborhood of $\{\pm q\}$ in SO(3).

It turns out that all the groups SU(n) and Sp(n) are simply connected, 
and all the groups SO(n) for $n \ge 3$ are doubly covered by simply connected
groups. Thus simply connected groups arise naturally from the classical 
groups. They are the “topologically simplest” among the groups with a 
given Lie algebra. The other thing to understand is the relationship between 
Lie group homomorphisms (such as the 2-to-1 map of SU(2) onto SO(3) 
just mentioned) and Lie algebra homomorphisms. This is the subject of the 
next section. 


Exercises 


A more easily visualized example of a non-simply-connected space with simply
connected double cover is the real projective plane $\mathbb{RP}^2$, which consists
of the antipodal point pairs $\pm P$ on the ordinary sphere $S^2$. Consider the path
p on $S^2$ that goes halfway around the equator, from a point Q to its antipodal
point $-Q$.


9.2.1 Explain why the corresponding path $\pm p$ on $\mathbb{RP}^2$, consisting of the
point pairs $\pm P$ for $P \in p$, is a closed path on $\mathbb{RP}^2$.


9.2.2 Suppose that $\pm p$ can be deformed on $\mathbb{RP}^2$ to a single point. Draw a
picture that illustrates the effect of a small “deformation” of $\pm p$ on the
corresponding set of points on $S^2$.


9.2.3 Explain why a deformation of $\pm p$ on $\mathbb{RP}^2$ to a single point implies a
deformation of p to a pair of antipodal points on $S^2$, which is impossible.


9.3 Lie homomorphisms


In Section 2.2 we defined a group homomorphism to be a map $\Phi : G \to H$
such that $\Phi(g_1 g_2) = \Phi(g_1)\Phi(g_2)$ for all $g_1, g_2 \in G$. In the case
of Lie groups G and H, where the group operation is smooth, it is appropriate
that $\Phi$ preserve smoothness as well, so we define a Lie group homomorphism
to be a smooth map $\Phi : G \to H$ such that $\Phi(g_1 g_2) = \Phi(g_1)\Phi(g_2)$
for all $g_1, g_2 \in G$.

Now suppose that G and H are matrix Lie groups, with Lie algebras
(tangent spaces at the identity) $T_1(G) = \mathfrak{g}$ and $T_1(H) = \mathfrak{h}$,
respectively. Our fundamental theorem says that a Lie group homomorphism
$\Phi : G \to H$ “induces” (in a sense made clear in the statement of the
theorem) a Lie algebra homomorphism $\varphi : \mathfrak{g} \to \mathfrak{h}$, that is,
a linear map that preserves the Lie bracket.

The induced map $\varphi$ is the “obvious” one that associates the initial
velocity $A'(0)$ of a smooth path A(t) through 1 in G with the initial velocity
$(\Phi \circ A)'(0)$ of the image path $\Phi(A(t))$ in H. It is not completely
obvious that this map is well-defined; that is, it is not clear that if
A(0) = B(0) = 1 and $A'(0) = B'(0)$ then $(\Phi \circ A)'(0) = (\Phi \circ B)'(0)$.
But we can sidestep this problem by defining a smooth map $\Phi : G \to H$ to be
one for which the correspondence $A'(0) \mapsto (\Phi \circ A)'(0)$ is a
well-defined and linear map from $T_1(G)$ to $T_1(H)$.

Then it remains only to prove that $\varphi$ preserves the Lie bracket, and we
have already done most of this in proving the Lie algebra properties of the
tangent space in Section 5.4.

For the sake of brevity, we will use the term “Lie homomorphism” for 
both Lie group homomorphisms and Lie algebra homomorphisms. 


The induced homomorphism. For any Lie homomorphism $\Phi : G \to H$ of
matrix Lie groups G, H, with Lie algebras $\mathfrak{g}$, $\mathfrak{h}$, respectively,
there is a Lie homomorphism $\varphi : \mathfrak{g} \to \mathfrak{h}$ such that


$$\varphi(A'(0)) = (\Phi \circ A)'(0)$$


for any smooth path A(t) through 1 in G.


Proof. Thanks to our definition of a smooth map $\Phi$, it remains only to
prove that $\varphi$ preserves the Lie bracket, that is,


$$\varphi[A'(0), B'(0)] = [\varphi(A'(0)), \varphi(B'(0))]$$
for any smooth paths A(t), B(t) in G with A(0) = B(0) = 1.

We do this, as in Section 5.4, by considering the smooth path in G

$$C_s(t) = A(s)B(t)A(s)^{-1} \quad\text{for a fixed value of } s.$$

The $\Phi$-image of this path in H is

$$\Phi(C_s(t)) = \Phi\left(A(s)B(t)A(s)^{-1}\right) = \Phi(A(s)) \cdot \Phi(B(t)) \cdot \Phi(A(s))^{-1}$$
because $\Phi$ is a group homomorphism. As we calculated in Section 5.4,

$$C_s'(0) = A(s)B'(0)A(s)^{-1} \in \mathfrak{g},$$

so


$$\varphi(C_s'(0)) = \left.\frac{d}{dt}\right|_{t=0} \Phi(A(s)) \cdot \Phi(B(t)) \cdot \Phi(A(s))^{-1}
= (\Phi \circ A)(s) \cdot (\Phi \circ B)'(0) \cdot (\Phi \circ A)(s)^{-1} \in \mathfrak{h}.$$


As s varies, $C_s'(0)$ traverses a smooth path in $\mathfrak{g}$ and
$\varphi(C_s'(0))$ traverses a smooth path in $\mathfrak{h}$. Therefore, by the
linearity of $\varphi$,


$$\varphi(\text{tangent to } C_s'(0) \text{ at } s = 0) = (\text{tangent to } \varphi(C_s'(0)) \text{ at } s = 0). \qquad (*)$$
Now we know from Section 5.4 that the tangent to $C_s'(0)$ at s = 0 is
$$A'(0)B'(0) - B'(0)A'(0) = [A'(0), B'(0)].$$
A similar calculation shows that the tangent to $\varphi(C_s'(0))$ at s = 0 is


$$(\Phi \circ A)'(0) \cdot (\Phi \circ B)'(0) - (\Phi \circ B)'(0) \cdot (\Phi \circ A)'(0) = [\varphi(A'(0)), \varphi(B'(0))].$$


So it follows from (*) that


$$\varphi[A'(0), B'(0)] = [\varphi(A'(0)), \varphi(B'(0))],$$


as required.


If $\Phi : G \to H$ is a Lie isomorphism, then $\Phi^{-1} : H \to G$ is also a Lie
isomorphism, and it maps any smooth path through 1 in H back to a smooth
path through 1 in G. Thus $\Phi$ maps the smooth paths through 1 in G onto
all the smooth paths through 1 in H, and hence $\varphi$ is onto $\mathfrak{h}$.

It follows that the Lie homomorphism $\varphi'$ induced by $\Phi^{-1}$ is from
$\mathfrak{h}$ into $\mathfrak{g}$. And since $\varphi'$ sends $(\Phi \circ A)'(0)$ in
$\mathfrak{h}$ to $(\Phi^{-1} \circ \Phi \circ A)'(0)$, that is, to $A'(0)$, we have
$\varphi' = \varphi^{-1}$. In other words, $\varphi$ is an isomorphism of
$\mathfrak{g}$ onto $\mathfrak{h}$, and so isomorphic Lie groups have isomorphic Lie
algebras.

The converse statement is not true, but it is “nearly” true. In Section 
9.6 we will show that groups G and H with isomorphic Lie algebras are 
themselves isomorphic if they are simply connected. The proof uses paths 
in G to “lift” a homomorphism from g in “small steps.” This necessitates 
further study of paths and their compactness, which we carry out in the 
next two sections. 


The trace homomorphism revisited 
In Sections 6.2 and 6.3 we have already observed that the map
$$\mathrm{Tr} : \mathfrak{g} \to \mathbb{C}$$


of a real or complex Lie algebra $\mathfrak{g}$ is a Lie algebra homomorphism. This
result also follows from the theorem above, because the trace is the Lie
algebra map induced by the det homomorphism for real or complex Lie
groups (Section 2.2) thanks to the formula


$$\det(e^A) = e^{\mathrm{Tr}(A)}$$


of Section 5.3.
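
The relation between det and Tr can be illustrated numerically for a 2 × 2 matrix. The sketch below computes the matrix exponential from a truncated power series, which is adequate for small matrices but is not a robust general method; the helper names `expm`, `det2`, and `trace`, the sample matrix, and the step size h are illustrative assumptions. The last lines approximate the initial velocity of the image path $\det(e^{tA})$ by a difference quotient, which should be close to Tr(A).

```python
import math

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential as the truncated power series sum of A^m / m!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[entry / m for entry in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the diagonal entries."""
    return sum(M[i][i] for i in range(len(M)))

A = [[0.3, -1.2], [0.8, 0.5]]   # arbitrary 2 x 2 real matrix

print(det2(expm(A)), math.exp(trace(A)))   # the two sides of the formula

# Initial velocity of the image path det(e^{tA}), by a difference quotient.
h = 1e-6
hA = [[h * x for x in row] for row in A]
print((det2(expm(hA)) - 1) / h, trace(A))
```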


Exercises 


Defining a smooth map to be one that induces a linear map of the tangent space, so 
that we don’t have to prove this fact, is an example of what Bertrand Russell called 
“the advantages of theft over honest toil” (in his Introduction to Mathematical 
Philosophy, Routledge 1919, p. 71). We may one day have to pay for it by having 
to prove that some “obviously smooth” map really is smooth by showing that it 
really does induce a linear map of the tangent space. 

I made the definition of smooth map $\Phi : G \to H$ mainly to avoid proving that
the map $\varphi : A'(0) \mapsto (\Phi \circ A)'(0)$ is well-defined. (That is, if
$A'(0) = B'(0)$ then $(\Phi \circ A)'(0) = (\Phi \circ B)'(0)$.) If we assume that
$\varphi$ is well-defined, then, to prove that $\varphi$ is linear, we need only
assume that $\Phi$ maps smooth paths to smooth paths. The proof goes as follows.

Consider the path C(t) = A(t)B(t), where A(t) and B(t) are smooth paths with
A(0) = B(0) = 1. Then we know from Section 5.4 that $C'(0) = A'(0) + B'(0)$.


9.3.1 Using the fact that $\Phi$ is a group homomorphism, show that we also have
$(\Phi \circ C)'(0) = (\Phi \circ A)'(0) + (\Phi \circ B)'(0)$.


9.3.2 Deduce from Exercise 9.3.1 that
$\varphi(A'(0) + B'(0)) = \varphi(A'(0)) + \varphi(B'(0))$.


9.3.3 Let D(t) = A(rt) for some real number r. Show that $D'(0) = rA'(0)$ and
$(\Phi \circ D)'(0) = r(\Phi \circ A)'(0)$.


9.3.4 Deduce from Exercises 9.3.2 and 9.3.3 that $\varphi$ is linear.


9.4 Uniform continuity of paths and deformations 


The existence of space-filling curves shows that a continuous image of the 
unit interval [0, 1] may be very “tangled.” Indeed, the image of an arbitrar- 
ily short subinterval may fill a whole square in the plane. Nevertheless, the 
compactness of [0, 1] ensures that the images of small segments of [0,1] are 
“uniformly” small. This is formalized by the following theorem, an easy 
consequence of the uniform continuity of continuous functions on compact 
sets from Section 8.5. 


Uniform continuity of paths. If $p : [0,1] \to \mathbb{R}^n$ is a path, then, for
any $\varepsilon > 0$, it is possible to divide [0,1] into a finite number of
subintervals, each of which is mapped by p into an open ball of radius
$\varepsilon$.


Proof. The interval [0,1] is compact, by the Heine–Borel theorem of Section
8.4, so p is uniformly continuous by the theorem of Section 8.5. In
other words, for each $\varepsilon > 0$ there is a $\delta > 0$ such that
$|p(Q) - p(R)| < \varepsilon$ for any points $Q, R \in [0,1]$ such that
$|Q - R| < \delta$.

Now divide [0,1] into subintervals of length $< \delta$ and pick a point Q in
each subinterval (say, the midpoint). Each subinterval is mapped by p into
the open ball with center p(Q) and radius $\varepsilon$ because, if R is in the
same subinterval as Q, we have $|Q - R| < \delta$, and hence
$|p(Q) - p(R)| < \varepsilon$.
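
For a specific path the subdivision in the proof can be carried out by machine. The sketch below assumes a path for which a suitable $\delta$ is known in advance: the circle path $p(t) = (\cos 2\pi t, \sin 2\pi t)$ satisfies $|p(Q) - p(R)| \le 2\pi|Q - R|$, so $\delta = \varepsilon / 2\pi$ works. The function names, the sampling resolution, and the choice $\varepsilon = 0.1$ are illustrative assumptions, and the check by sampling is numerical evidence rather than a proof.

```python
import math

def p(t):
    """Example path: once around the unit circle in R^2."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def subdivision(eps, lipschitz=2 * math.pi):
    """Endpoints of equal subintervals of [0, 1], each of length < eps / lipschitz."""
    delta = eps / lipschitz
    n = math.floor(1 / delta) + 1        # n equal pieces, each of length 1/n < delta
    return [i / n for i in range(n + 1)]

def max_distance_from_midpoint(a, b, samples=50):
    """Largest sampled distance |p(Q) - p(R)|, with Q the midpoint of [a, b]."""
    q = p((a + b) / 2)
    return max(math.dist(q, p(a + (b - a) * j / samples)) for j in range(samples + 1))

eps = 0.1
cuts = subdivision(eps)
worst = max(max_distance_from_midpoint(a, b) for a, b in zip(cuts, cuts[1:]))
print(len(cuts) - 1, worst)   # number of subintervals and the worst sampled distance
```

Each subinterval is mapped into the open ball of radius $\varepsilon$ about the image of its midpoint, as in the proof.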


The same proof applies in two dimensions, almost word for word. 


Uniform continuity of path deformations. If
$d : [0,1] \times [0,1] \to \mathbb{R}^n$ is a path deformation, then, for any
$\varepsilon > 0$, it is possible to divide the square [0,1] × [0,1] into a finite
number of subsquares, each of which is mapped by d into an open ball of radius
$\varepsilon$.


Proof. The square [0,1] × [0,1] is compact, by the generalized Heine–Borel
theorem of Section 8.4, so d is uniformly continuous by the theorem
of Section 8.5. In other words, for each $\varepsilon > 0$ there is a
$\delta > 0$ such that $|d(Q) - d(R)| < \varepsilon$ for any points
$Q, R \in [0,1] \times [0,1]$ such that $|Q - R| < \delta$.

Now divide [0,1] × [0,1] into subsquares of diagonal $< \delta$ and pick a
point Q in each subsquare (say, the center). Each subsquare is mapped by
d into the open ball with center d(Q) and radius $\varepsilon$ because, if R is in
the same subsquare as Q, we have $|Q - R| < \delta$ and hence
$|d(Q) - d(R)| < \varepsilon$.


Exercises 


9.4.1 Show that the function f(x) = 1/x is continuous, but not uniformly contin- 
uous, on the open interval (0, 1). 


9.4.2 Give an example of a continuous function that is not uniformly continuous
on $\mathrm{GL}(2,\mathbb{C})$.


9.5 Deforming a path in a sequence of small steps 


The proof of uniform continuity of path deformations assumes only that d
is a continuous map of the square into $\mathbb{R}^n$. We now need to recall how
such a map is interpreted as a “path deformation.” The restriction of d to the
bottom edge of the square is one path p, the restriction to the top edge is
another path q, and the restriction to the various horizontal sections of the
square is a “continuous series” of paths between p and q, that is, a
deformation from p to q. Figure 9.2 shows the “deformation snapshots” of
Figure 8.3 further subdivided by vertical sections of the square, thus
subdividing the square into small squares that are mapped to “deformed
squares” by d.


Figure 9.2: Snapshots of a path deformation.


The subdivision of the square into small subsquares is done with the
following idea in mind:


• By making the subsquares sufficiently small we can ensure that their
images lie in $\varepsilon$-balls of $\mathbb{R}^n$ for any prescribed $\varepsilon$.


• The bottom edge of the unit square can be deformed to the top
edge by a finite sequence of deformations $d_{ij}$, each of which is the
identity map of the unit square outside a neighborhood of the
(i, j)-subsquare.


• It follows that if p can be deformed to q then the deformation can
be divided into a finite sequence of steps. Each step changes the
image only in a neighborhood of a “deformed square,” and hence in
an $\varepsilon$-ball.


To make this argument more precise, though without defining the d_ij in
tedious detail, we suppose the effect of a typical d_ij on the (i, j)-subsquare
to be as shown by the snapshots in Figure 9.3. In this case, the bottom
and right edges are pulled to the position of the left and top edges,
respectively, by “stretching” in a neighborhood of the bottom and right edges and
“compressing” in a neighborhood of the left and top. This deformation
will necessarily move some points in the neighboring subsquares (where
such subsquares exist), but we can make the affected region outside the
(i, j)-subsquare as small as we please. Thus d_ij is the identity outside a
neighborhood of, and arbitrarily close to, the (i, j)-subsquare.


Figure 9.3: Deformation d_ij of the (i, j)-subsquare.


Now, if the (1,1)-subsquare is the one on the bottom left and there are
n subsquares in each row, we can move the bottom edge to the top through
the sequence of deformations d_11, d_12, ..., d_1n, d_2n, ..., d_21, d_31, .... Figure
9.4 shows the first few steps in this process when n = 4.

Figure 9.4: Sequence deforming the bottom edge to the top.

Since each d_ij is a map of the unit square into itself, equal to the identity
outside a neighborhood of the (i, j)-subsquare, the composite map d ∘ d_ij
(“d_ij then d”) agrees with d everywhere except on a neighborhood of the
image of the (i, j)-subsquare. Intuitively speaking, d ∘ d_ij moves one side
of the image subsquare to the other, while keeping the image fixed outside
a neighborhood of the image subsquare.

It follows that if d is a deformation of path p to path q and d_ij runs
through the sequence of maps that deform the bottom edge of the unit
square to the top, then the sequence of composite maps d ∘ d_ij deforms
p to q, and each d ∘ d_ij agrees with d outside a neighborhood of the image
of the (i, j)-subsquare, and hence outside an ε-ball.

In this sense, if a path p can be deformed to a path q, then p can be 
deformed to q in a finite sequence of “small” steps. 
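
The order in which the subsquares are visited, d_11, d_12, ..., d_1n, d_2n, ..., d_21, d_31, ..., is a back-and-forth sweep through the rows. As a small sketch (the function name snake_order is invented for this illustration, and the code only lists index pairs, it does not construct the maps d_ij themselves):

```python
def snake_order(n):
    """Index pairs (i, j) in the order d_11, ..., d_1n, d_2n, ..., d_21, d_31, ...
    for an n-by-n grid whose (1, 1)-subsquare is at the bottom left."""
    order = []
    for i in range(1, n + 1):                      # rows, from bottom to top
        if i % 2 == 1:
            columns = range(1, n + 1)              # odd rows: left to right
        else:
            columns = range(n, 0, -1)              # even rows: right to left
        order.extend((i, j) for j in columns)
    return order

steps = snake_order(4)
print(steps[:6])
```

Consecutive index pairs in this list always belong to neighboring subsquares, which is what allows the edge to be moved across one subsquare at a time.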


Exercises 


9.5.1 If a < 0 < 1 < b, give a continuous map of (a,b) onto (a,b) that sends
0 to 1. Use this map to define d_ij when the (i, j)-subsquare is in the interior of
the unit square.


9.5.2 If 1 < b, give a continuous map of [0,b) onto [1,b) that sends 0 to 1, and
use it (and perhaps also the map in Exercise 9.5.1) to define d_ij when the
(i, j)-subsquare is one of the boundary squares of the unit square.


9.6 Lifting a Lie algebra homomorphism 


Now we are ready to achieve the main goal of this chapter: showing that
if g and h are the Lie algebras of simply connected Lie groups G and H,
respectively, then each Lie algebra homomorphism φ : g → h is induced by
a Lie group homomorphism Φ : G → H. This is the converse of the theorem
in Section 9.3, and the two theorems together show that the structure of
simply connected Lie groups is completely captured by their Lie algebras.
The idea of the proof is to “lift” the homomorphism φ from g to G in small
pieces, with the help of the exponential function and the
Campbell-Baker-Hausdorff theorem of Section 7.7.

We already know, from Section 7.4, that there is a neighborhood 𝒩
of 1 in G that is the 1-to-1 image of a neighborhood of 0 in g under the
exponential function. We also know, by Campbell-Baker-Hausdorff, that
the product of two elements e^X, e^Y ∈ G is given by a formula

\[
e^X e^Y = e^{X + Y + \frac{1}{2}[X,Y] + \text{further Lie bracket terms}}.
\]

Therefore, if we define Φ on each element e^X of G by Φ(e^X) = e^{φ(X)}, then

\begin{align*}
\Phi(e^X e^Y) &= \Phi\bigl(e^{X + Y + \frac{1}{2}[X,Y] + \text{further Lie bracket terms}}\bigr) \\
&= e^{\varphi(X + Y + \frac{1}{2}[X,Y] + \text{further Lie bracket terms})} \\
&= e^{\varphi(X) + \varphi(Y) + \frac{1}{2}[\varphi(X),\varphi(Y)] + \text{further Lie bracket terms}}
   && \text{because $\varphi$ is a Lie algebra homomorphism} \\
&= e^{\varphi(X)} e^{\varphi(Y)} && \text{by Campbell-Baker-Hausdorff} \\
&= \Phi(e^X)\,\Phi(e^Y).
\end{align*}


Thus Φ is a Lie group homomorphism, at least in the region 𝒩 where
every element of G is of the form e^X. However, not all elements of G are
necessarily of this form, so we need to extend Φ to an arbitrary A ∈ G by
some other means. This is where we need the simple connectedness of G,
and we carry out a four-stage process, explained in detail below.
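
The Campbell-Baker-Hausdorff formula used in the local argument can be checked numerically for small matrices. The sketch below is an illustration only: the matrices X and Y are arbitrary small 2 × 2 examples chosen for this purpose, and the exponential and logarithm are computed from truncated power series in plain Python rather than by any library routine.

```python
def mat_mul(A, B):
    """Matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B, c=1.0):
    """Return A + c*B."""
    return [[a + c * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(X, terms=30):
    """Truncated exponential series 1 + X + X^2/2! + ..."""
    result, power = identity(len(X)), identity(len(X))
    for k in range(1, terms):
        power = [[x / k for x in row] for row in mat_mul(power, X)]  # X^k / k!
        result = mat_add(result, power)
    return result

def mat_log(A, terms=60):
    """Truncated logarithm series (A-1) - (A-1)^2/2 + ..., for A near 1."""
    n = len(A)
    D = mat_add(A, identity(n), -1.0)
    result, power = [[0.0] * n for _ in range(n)], identity(n)
    for k in range(1, terms):
        power = mat_mul(power, D)                                    # (A-1)^k
        result = mat_add(result, power, (-1.0) ** (k + 1) / k)
    return result

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return mat_add(mat_mul(X, Y), mat_mul(Y, X), -1.0)

def norm(A):
    return max(abs(x) for row in A for x in row)

X = [[0.0, 0.02], [-0.02, 0.0]]      # arbitrary small matrices (assumptions)
Y = [[0.01, 0.03], [0.0, -0.01]]
Z = mat_log(mat_mul(mat_exp(X), mat_exp(Y)))      # Z with e^X e^Y = e^Z
first_order = mat_add(X, Y)
second_order = mat_add(first_order, bracket(X, Y), 0.5)
err1 = norm(mat_add(Z, first_order, -1.0))        # error of X + Y
err2 = norm(mat_add(Z, second_order, -1.0))       # error of X + Y + [X,Y]/2
print(err1, err2)
```

Including the bracket term makes the approximation to Z markedly better, and what remains is accounted for by the "further Lie bracket terms".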


1. Connect A to 1 by a path, and show that there is a sequence of points
1 = A_1, A_2, ..., A_m = A
along the path such that A_1, A_1^{-1}A_2, ..., A_{m-1}^{-1}A_m all lie in 𝒩, and
hence such that all of Φ(A_1), Φ(A_1^{-1}A_2), ..., Φ(A_{m-1}^{-1}A_m) are defined.
Motivated by the fact that A = A_1 · A_1^{-1}A_2 · ... · A_{m-1}^{-1}A_m, we let

\[
\Phi(A) = \Phi(A_1)\,\Phi(A_1^{-1}A_2)\cdots\Phi(A_{m-1}^{-1}A_m).
\]


2. Show that Φ(A) does not change when the sequence A_1, A_2, ..., A_m is
“refined” by inserting an extra point. Since any two sequences have
a common refinement, obtained by inserting extra points, the value
of Φ(A) is independent of the sequence of points along the path.


3. Show that Φ(A) is also independent of the path from 1 to A, by
showing that Φ(A) does not change under a small deformation of the path.
(Simple connectedness of G ensures that any two paths from 1 to A
may be made to coincide by a sequence of small deformations.)


4. Check that Φ is a group homomorphism, and that it induces the Lie
algebra homomorphism φ.


Stage 1. Finding a sequence of points 1 = A_1, A_2, ..., A_m = A.


In Section 8.6 we showed how to do this, under the title of “generating
a path-connected group from a neighborhood of 1.” We found
1 = A_1, A_2, ..., A_m = A so that A_i and A_{i+1} lie in the same set A(t_j)𝒪,
where 𝒪 is an open subset of 𝒩 small enough that C_i^{-1}C_{i+1} ∈ 𝒩 for
any C_i, C_{i+1} ∈ 𝒪.

Then Φ(A_1) = Φ(1) is defined, and so is Φ(A_i^{-1}A_{i+1}) for each i.
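
As a concrete illustration of the product that defines Φ(A), and of the refinement step in item 2 of the outline above, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: G is taken to be the group of matrices [[1, x], [0, 1]], H the group of matrices [[e^x, 0], [0, e^{-x}]], φ sends [[0, x], [0, 0]] to [[x, 0], [0, −x]], and RADIUS is a hypothetical bound standing in for the neighborhood 𝒩. Both groups are abelian, so the example is deliberately simple.

```python
import math

def mul(A, B):
    """Product of 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inv_unipotent(A):
    """Inverse of the matrix [[1, x], [0, 1]]."""
    return ((1.0, -A[0][1]), (0.0, 1.0))

RADIUS = 0.5   # hypothetical size of the neighborhood on which Phi is given directly

def phi_local(A):
    """Phi(e^X) = e^{phi(X)} for A = e^X = [[1, x], [0, 1]] with |x| < RADIUS,
    where phi sends [[0, x], [0, 0]] to [[x, 0], [0, -x]]."""
    x = A[0][1]
    if abs(x) >= RADIUS:
        raise ValueError("element lies outside the neighborhood")
    return ((math.exp(x), 0.0), (0.0, math.exp(-x)))

def path(t, end=3.0):
    """Path A(t) from 1 (at t = 0) to A = [[1, end], [0, 1]] (at t = 1)."""
    return ((1.0, end * t), (0.0, 1.0))

def phi_along(ts):
    """Phi(A_1) Phi(A_1^{-1} A_2) ... Phi(A_{m-1}^{-1} A_m) with A_i = path(ts[i])."""
    points = [path(t) for t in ts]
    value = phi_local(points[0])
    for prev, nxt in zip(points, points[1:]):
        value = mul(value, phi_local(mul(inv_unipotent(prev), nxt)))
    return value

coarse = [k / 10 for k in range(11)]       # consecutive points differ by 0.3 < RADIUS
refined = sorted(coarse + [0.25])          # insert one extra point on the path
gap = max(abs(a - b)
          for row_a, row_b in zip(phi_along(coarse), phi_along(refined))
          for a, b in zip(row_a, row_b))
print(gap)
```

The endpoint A itself lies outside the chosen neighborhood, so phi_local cannot be applied to it directly, yet the product along the path is defined and is unaffected (up to rounding) by the extra point.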


Stage 2. Independence of the sequence along the path.

Suppose that A′ is another point on the path A(t), in the same
neighborhood A(t_j)𝒪 as A_i and A_{i+1}. When the sequence is refined from

A_1, ..., A_i, A_{i+1}, ..., A_m   to   A_1, ..., A_i, A′, A_{i+1}, ..., A_m,

the expression for Φ(A) is changed by replacing the factor Φ(A_i^{-1}A_{i+1}) by
the two factors Φ(A_i^{-1}A′)Φ(A′^{-1}A_{i+1}). Then both A_i^{-1}A′ and A′^{-1}A_{i+1} are
in 𝒩, by the choice of 𝒪 in Stage 1, and so

\begin{align*}
\Phi(A_i^{-1}A')\,\Phi(A'^{-1}A_{i+1}) &= \Phi(A_i^{-1}A'A'^{-1}A_{i+1})
   && \text{because $\Phi$ is a homomorphism on $\mathcal{N}$} \\
&= \Phi(A_i^{-1}A_{i+1}).
\end{align*}

Hence insertion of an extra point does not change the value of Φ(A).
Stage 3. Independence of the path. 


Given paths p and q from 1 to A, we know that p can be deformed to q
because G is simply connected. Let d : [0,1] × [0,1] → G be a deformation
from p to q. Each point P in the unit square has a neighborhood

N(P) = {Q : d(P)^{-1}d(Q) ∈ 𝒩},

which is open by the continuity of d and matrix multiplication. Inside N(P)
we choose a square neighborhood S(P) with center P and sides parallel
to the sides of the unit square. Then the unit square is contained in the
union of these square neighborhoods, and hence in a finite union of them,
S(P_1) ∪ S(P_2) ∪ ··· ∪ S(P_k), by compactness.

Let ε be the minimum side length of the finitely many rectangular
overlaps of the squares S(P_j) covering the unit square. Then, if we divide the
unit square into equal subsquares of some width less than ε, each subsquare
lies in a square S(P_j). Therefore, for any two points P, Q in the subsquare,
we have d(P)^{-1}d(Q) ∈ 𝒩.

This means that we can deform p to q by “steps” (as described in the
previous section) within regions of G where the point d(P) inserted or
removed in each step is such that d(P)^{-1}d(Q) ∈ 𝒩 for its neighbor vertices
d(Q) on the path, so Φ(d(P)^{-1}d(Q)) is defined. Consequently, Φ can be
defined along the path obtained at each step of the deformation, and we can
argue as in Stage 2 that the value of Φ does not change.


Stage 4. Verification that Φ is a homomorphism that induces φ.

Suppose that A, B ∈ G and that 1 = A_1, A_2, ..., A_m = A is a sequence of
points such that A_i^{-1}A_{i+1} ∈ 𝒪 for each i, so

\[
\Phi(A) = \Phi(A_1)\,\Phi(A_1^{-1}A_2)\cdots\Phi(A_{m-1}^{-1}A_m).
\]

Similarly, let 1 = B_1, B_2, ..., B_n = B be a sequence of points such that
B_i^{-1}B_{i+1} ∈ 𝒪 for each i, so

\[
\Phi(B) = \Phi(B_1)\,\Phi(B_1^{-1}B_2)\cdots\Phi(B_{n-1}^{-1}B_n).
\]

Now notice that 1 = A_1, A_2, ..., A_m = AB_1, AB_2, ..., AB_n is a sequence of
points, leading from 1 to AB, such that any two adjacent points lie in a
neighborhood of the form C𝒪. Indeed, if the points B_i and B_{i+1} both lie in
C𝒪 then AB_i and AB_{i+1} both lie in AC𝒪. It follows that

\begin{align*}
\Phi(AB) &= \Phi(A_1)\,\Phi(A_1^{-1}A_2)\cdots\Phi(A_{m-1}^{-1}A_m) \\
&\qquad \times \Phi((AB_1)^{-1}AB_2)\,\Phi((AB_2)^{-1}AB_3)\cdots\Phi((AB_{n-1})^{-1}AB_n) \\
&= \Phi(A_1)\,\Phi(A_1^{-1}A_2)\cdots\Phi(A_{m-1}^{-1}A_m) \\
&\qquad \times \Phi(B_1^{-1}B_2)\,\Phi(B_2^{-1}B_3)\cdots\Phi(B_{n-1}^{-1}B_n) \\
&= \Phi(A)\,\Phi(B) \qquad \text{because } \Phi(B_1) = \Phi(1) = 1.
\end{align*}

Thus Φ is a homomorphism.

To show that Φ induces φ it suffices to show this property on 𝒩,
because we have shown that there is only one way to extend Φ beyond 𝒩.
On 𝒩, Φ(A) = e^{φ(log(A))}, so for the path e^{tX} through 1 in G we have

\[
\left.\frac{d}{dt}\right|_{t=0} \Phi(e^{tX})
= \left.\frac{d}{dt}\right|_{t=0} e^{t\varphi(X)}
= \varphi(X).
\]

Thus Φ induces the Lie algebra homomorphism φ.

Putting these four stages together, we finally have the result:


Homomorphisms of simply connected groups. If g and h are the Lie
algebras of the simply connected Lie groups G and H, respectively, and if
φ : g → h is a homomorphism, then there is a homomorphism Φ : G → H
that induces φ.


Corollary. If G and H are simply connected Lie groups with isomorphic 
Lie algebras g and h, respectively, then G is isomorphic to H. 


Proof. Suppose that φ : g → h is a Lie algebra isomorphism, and let the
homomorphism that induces φ be Φ : G → H. Also, let Ψ : H → G be the
homomorphism that induces φ^{-1}. It suffices to show that Ψ = Φ^{-1}, since
this implies that Φ is a Lie group isomorphism.

Well, it follows from the definition of the “lifted” homomorphisms that
Ψ ∘ Φ : G → G is the unique homomorphism that induces the identity map
φ^{-1} ∘ φ : g → g, hence Ψ ∘ Φ is the identity map on G. In other words,
Ψ = Φ^{-1}.


9.7 Discussion 


The final results of this chapter, and many of the underlying ideas, are due
to Schreier [1925] and Schreier [1927]. In the 1920s, understanding of
the connections between group theory and topology grew rapidly, mainly
under the influence of topologists, who were interested in discrete groups
and covering spaces. Schreier was the first to see clearly that topology is
important in Lie theory and that it separates Lie algebras from Lie groups.
Lie algebras are topologically trivial but Lie groups are generally not, and
Schreier introduced the concept of covering space to distinguish between
Lie groups with the same Lie algebra. He pointed out that every Lie group
G has a universal covering G̃ → G, the unique continuous local
isomorphism of a simply connected group onto G. Examples are the
homomorphisms R → S^1 and SU(2) → SO(3). In general, the universal covering is
constructed by “lifting,” much as we did in the previous section.

The universal covering construction is inverse to the construction of the
quotient by a discrete group because the kernel of G̃ → G is a discrete
subgroup of G̃, known to topologists as the fundamental group of G, π_1(G).
Thus G is recovered from G̃ as the quotient G̃/π_1(G) = G. Another
important result discovered by Schreier [1925] is that π_1(G) is abelian for a
Lie group G. This result strongly constrains the topology of Lie groups,
because the fundamental group of an arbitrary smooth manifold can be any
finitely presented group. A “random” smooth manifold has a nonabelian
fundamental group.

Like the quotient construction (see Section 3.9), the universal covering
can produce a nonmatrix group G̃ from a matrix group G. A famous
example, essentially due to Cartan [1936], is the universal covering group
$\widetilde{SL(2,\mathbb{C})}$ of the matrix group SL(2,C). Thus topology provides another
path to the world of Lie groups beyond the matrix groups.

Topology makes up the information lost when we pass from Lie groups
to Lie algebras, and in fact topology makes it possible to bypass Lie
algebras almost entirely. A notable book that conducts Lie theory at the group
level is Adams [1969], by the topologist J. Frank Adams. It should be said,
however, that Adams’s approach uses topology that is more sophisticated
than the topology used in this chapter.


Finite simple groups 


The classification of simple Lie groups by Killing and Cartan is a
remarkable fact in itself, but even more remarkable is that it paves the way for the
classification of finite simple groups, a much harder problem, but one that
is related to the classification of continuous groups. Surprisingly, there are
finite analogues of continuous groups in which the role of R or C is played
by finite fields.*

As mentioned in Section 2.8, finite simple groups were discovered by
Galois around 1830 as a key concept for understanding unsolvability in the
theory of equations. Galois explained solution of equations by radicals as a
process of “symmetry breaking” that begins with the group of all
symmetries of the roots and factors it into smaller groups by taking square roots,
cube roots, and so on. The process first fails with the general quintic
equation, where the symmetry group is S_5, the group of all 120 permutations of
five things. The group S_5 may be factored down to the group A_5 of the 60
even permutations of five things by taking a suitable square root, but it is
not possible to proceed further because A_5 is a simple group.

* This brings to mind a quote attributed to Stan Ulam: The infinite we can do right away,
the finite will take a little longer.

More generally, A_n is simple for n ≥ 5, so Galois had in fact discovered
an infinite family of finite simple groups. Apart from the infinite family of
cyclic groups of prime order, the finite simple groups in the other infinite
families are finite analogues of Lie groups. Each infinite matrix Lie group
G spawns infinitely many finite groups, obtained by replacing the matrix
entries in elements of G by entries from a finite field, such as the field of
integers mod 2. There is a finite field of size q for each prime power q, so
infinitely many finite groups correspond to each infinite matrix Lie group
G. These are called the finite groups of Lie type.
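
The passage from a matrix Lie group to a finite group can be made concrete by brute force. The following Python sketch counts the 2 × 2 matrices of determinant 1 with entries in the integers mod p, that is, the finite analogue of SL(2,C) over the field with p elements. It handles prime p only (fields of prime-power size need different arithmetic), and the function name sl2_order is invented for this illustration.

```python
from itertools import product

def sl2_order(p):
    """Number of 2x2 matrices [[a, b], [c, d]] with entries in the integers
    mod p (p prime) and determinant ad - bc equal to 1 mod p."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p == 1)

for p in (2, 3, 5):
    print(p, sl2_order(p))
```

For these primes the counts agree with the known formula p(p² − 1) for the order of this group.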

It turns out that each simple Lie group yields infinitely many finite
simple groups in this way. So, alongside the family of alternating groups, we
have a family of simple groups of Lie type for each simple Lie group. The
finite simple groups that fall outside these families are therefore even more
exceptional than the exceptional Lie groups. They are called the sporadic
groups, and there are 26 of them. The story of the sporadic simple groups
is a long one, filled with so many amazing episodes that it is impossible
to sketch it here. Instead, I recommend the book Ronan [2006] for an
overview, and Thompson [1983] for a taste of the mathematics.